Introduction


About this book 

This book has been written to cover the Cambridge International AS & 
A Level Mathematics (9709) course, and is fully aligned to the syllabus. 
In addition to the main curriculum content, you will find: 


• ‘Maths in real-life’, showing how principles learned in this course are
used in the real world.


• Chapter openers, which outline how each topic in the Cambridge
9709 syllabus is used in real-life.


The book contains the following features: 


Did you know?


EXAM-STYLE QUESTION


Advice on calculator use


Throughout the book, you will encounter worked examples and a host
of rigorous exercises. The examples show you the important techniques
required to tackle questions. The exercises are carefully graded, starting 
from a basic level and going up to exam standard, allowing you plenty of 
opportunities to practise your skills. Together, the examples and exercises 
put maths in a real-world context, with a truly international focus. 


At the start of each chapter, you will see a list of objectives covered
in the chapter. These are drawn from the Cambridge AS and A Level
syllabus. Each chapter begins with a Before you start section and ends 
with a Summary exercise and Chapter summary, ensuring that you fully 
understand each topic. 


Each chapter contains key mathematical terms to improve understanding, 
highlighted in colour, with full definitions provided in the Glossary of 
terms at the end of the book. 


The answers given at the back of the book are concise. However, you 
should show as many steps in your working as possible. All exam-style 
questions have been written by the author. 


About the author 


James Nicholson is an experienced teacher of mathematics at secondary
level, having taught for 12 years at Harrow School as well as spending 13 years as
Head of Mathematics in a large Belfast grammar school. He is the author 
of two A Level statistics texts, and editor of the Concise Oxford Dictionary 
of Mathematics. He has also contributed to a number of other sets of 
curriculum and assessment materials, is an experienced examiner and has 
acted as a consultant for UK government agencies on accreditation of new 
specifications. 

James ran schools workshops for the Royal Statistical Society for many 
years, and has been a member of the Schools and Further Education 
Committee of the Institute of Mathematics and its Applications since 2000, 
including six years as chair, and is currently a member of the Community 
of Interest group for the Advisory Committee on Mathematics Education. 
He has served as a vice-president of the International Association for 
Statistics Education for four years, and is currently Chair of the Advisory 
Board to the International Statistical Literacy Project. 


A note from the author 


The aim of this book is to help students prepare for the Statistics 2 unit of the 
Cambridge International AS and A Level Mathematics syllabus, though it 
may also be useful as support material for other
AS and A Level courses. The book contains a large number of practice
questions, many of which are exam-style. 


In writing the book I have drawn on my experiences of teaching 
Mathematics, Statistics and Further Mathematics to A Level over many 
years as well as on my experience as an examiner, and discussion with 
statistics educators from many countries at international conferences. 
PROBABILITY & STATISTICS 2


Student Book


Syllabus overview 


Unit S2: Probability & Statistics 2 (Paper 6) 


1. The Poisson distribution 


• Calculate probabilities for the distribution Po(λ)

• Use the fact that if X ~ Po(λ) then the mean and variance of X are each equal to λ

• Understand the relevance of the Poisson distribution to the distribution of random
events, and use the Poisson distribution as a model

• Use the Poisson distribution as an approximation to the binomial distribution where
appropriate (n > 50 and np < 5, approximately)

• Use the normal distribution, with continuity correction, as an approximation to the
Poisson distribution where appropriate (λ > 15, approximately)


2. Linear combinations of random variables 


• Use, in the course of solving problems, the results that:

- E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X)
- E(aX + bY) = aE(X) + bE(Y)
- Var(aX + bY) = a²Var(X) + b²Var(Y) for independent X and Y
- if X has a normal distribution then so does aX + b
- if X and Y have independent normal distributions then aX + bY has a normal distribution
- if X and Y have independent Poisson distributions then X + Y has a Poisson distribution


3. Continuous random variables 


• Understand the concept of a continuous random variable, and recall and use properties
of a probability density function (restricted to functions defined over a single interval)

• Use a probability density function to solve problems involving probabilities, and to
calculate the mean and variance of a distribution (explicit knowledge of the cumulative
distribution function is not included, but location of the median, for example, in simple
cases by direct consideration of an area may be required)
4. Sampling and estimation


• Understand the distinction between a sample and a population, and appreciate the
necessity for randomness in choosing samples

• Explain in simple terms why a given sampling method may be unsatisfactory (knowledge
of particular sampling methods, such as quota or stratified sampling, is not required, but
candidates should have an elementary understanding of the use of random numbers in
producing random samples)

• Recognise that a sample mean can be regarded as a random variable, and use the
facts that $E(\bar{X}) = \mu$ and that $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^{2}}{n}$

• Use the fact that $\bar{X}$ has a normal distribution if X has a normal distribution

• Use the Central Limit Theorem where appropriate

• Calculate unbiased estimates of the population mean and variance from a sample, using
either raw or summarised data (only a simple understanding of the term ‘unbiased’ is
required)

• Determine and interpret a confidence interval for a population mean in cases where the
population is normally distributed with known variance or where a large sample is used

• Determine, from a large sample, an approximate confidence interval for a population
proportion


5. Hypothesis tests 


• Understand the nature of a hypothesis test, the difference between one-tail and two-tail
tests, and the terms null hypothesis, alternative hypothesis, significance level, rejection
region (or critical region), acceptance region and test statistic

• Formulate hypotheses and carry out a hypothesis test in the context of a single
observation from a population which has a binomial or Poisson distribution, using
- direct evaluation of probabilities
- a normal approximation to the binomial or the Poisson distribution, where
appropriate

• Formulate hypotheses and carry out a hypothesis test concerning the population mean
in cases where the population is normally distributed with known variance or where a
large sample is used

• Understand the terms Type I error and Type II error in relation to hypothesis tests

• Calculate the probabilities of making Type I and Type II errors in specific situations
involving tests based on a normal distribution or direct evaluation of binomial or Poisson
probabilities


1 The Poisson distribution 


The Poisson distribution can be used to (at
least approximately) model a large number of
natural and social phenomena. You might not
expect the number of photons arriving at a
cosmic ray observatory, the number of claims
made to an insurance company, the number
of earthquakes of a given intensity and the
number of atoms decaying in a radioactive
material to have much in common, but
they are all examples of this distribution.

The photo is of VERITAS (the Very Energetic
Radiation Imaging Telescope Array System) in Arizona, which
is helping to shape our understanding of
how subatomic particles like photons are
accelerated to extremely high energy levels.


Objectives 

After studying this chapter you should be able to: 

• Calculate probabilities for the distribution Po(λ).

• Use the fact that if X ~ Po(λ) then the mean and variance of X are each equal to λ.

• Understand the relevance of the Poisson distribution to the distribution of random events,
and use the Poisson distribution as a model.


Before you start 


You should know how to:

1. Use your calculator to work out values of exponential functions, e.g.
Find the value of $e^{-2.5}$.
$e^{-2.5} = 0.0821$ (3 s.f.)

2. Substitute values into more complex formulae, e.g.
Find the value of $p = \dfrac{e^{-2.5} \times 2.5^{4}}{4!}$.
$\dfrac{e^{-2.5} \times 2.5^{4}}{4!} = \dfrac{0.0821 \times 39.06}{24} = 0.134$ (3 d.p.)

Skills check:

1. Find the value of:
a) e&?
b) et!

2. Find the value of p= aa Es


1.1 Introducing the Poisson distribution 


Think about the following random variables: 


The number of dandelions in a square metre of a piece of open ground. 
The number of errors in a page of a typed manuscript. 
The number of cars passing a point on a motorway in a minute. 


The number of telephone calls received by a company switchboard in half an hour. 


The number of lightning strikes in an area over a year. 




Do they have any features in common? Does any one of them stand out 
as being rather different? 


The behaviour in five of these photos follows the Poisson distribution. 


Formally, the conditions are that 

i) events occur at random 

ii) events occur independently of one another 

iii) the average rate of occurrences remains constant 

iv) there is zero probability of simultaneous occurrences. 


The Poisson distribution is defined as 


$P(X = r) = \dfrac{e^{-\lambda}\lambda^{r}}{r!}$ for r = 0, 1, 2, 3, ...


You need to have a value for λ in order for this to make sense, so there is
a family of Poisson distributions but there is only one parameter, λ, which
is the mean number of occurrences in the time period (or length, area or
volume) being considered.


You can write the Poisson distribution as X ~ Po(λ).


Example 1 
If X ~ Po(3) find P(X = 2). 


$P(X = 2) = \dfrac{e^{-3} \times 3^{2}}{2!} = 0.224$ (3 s.f.)


Example 2 

The number of cars passing a point on a road during a 5-minute period 
may be modelled by the Poisson distribution with parameter 4. 

Find the probability that in a 5-minute period 

i) 2 cars go past ii) fewer than 3 cars go past.


X ~ Po(4)


i) $P(X = 2) = \dfrac{e^{-4} \times 4^{2}}{2!} = 0.146525... = 0.147$ (3 s.f.)


ii) $P(X = 0) = \dfrac{e^{-4} \times 4^{0}}{0!} = 0.01831... = 0.0183$ (3 s.f.) (remember that 0! = 1 and $4^{0} = 1$)

$P(X = 1) = \dfrac{e^{-4} \times 4^{1}}{1!} = 0.07326... = 0.0733$ (3 s.f.)

$P(X < 3) = 0.01831... + 0.07326... + 0.146525... = 0.238$ (3 s.f.)
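
The calculations in Examples 1 and 2 can also be checked with a few lines of Python. This is only a sketch: the function name `poisson_pmf` and the variable names are chosen here for illustration and are not part of the course.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam), using e^(-lam) * lam^r / r!."""
    return exp(-lam) * lam**r / factorial(r)

# Example 1: X ~ Po(3), P(X = 2)
p_example1 = poisson_pmf(2, 3)

# Example 2: X ~ Po(4)
p_two_cars = poisson_pmf(2, 4)                                 # part i)
p_fewer_than_three = sum(poisson_pmf(r, 4) for r in range(3))  # part ii): r = 0, 1, 2

print(round(p_example1, 3), round(p_two_cars, 3), round(p_fewer_than_three, 3))
# prints 0.224 0.147 0.238
```

Adding the unrounded probabilities before rounding, as the worked solution does, avoids accumulating rounding errors.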
Mathematical note: It is not immediately obvious from the
mathematics you cover in this course that the form of the Poisson
distribution constitutes a probability distribution. Remember from
S1 Chapter 5 that this requires all probabilities to be non-negative (which
they obviously all are here because exp(−λ) > 0 for any value of λ) but
also that the sum of the probabilities is 1.

$P(X = r) = \dfrac{e^{-\lambda}\lambda^{r}}{r!}$ for r = 0, 1, 2, 3, ... is a probability distribution

because $\sum_{r=0}^{\infty} \dfrac{\lambda^{r}}{r!} = 1 + \lambda + \dfrac{\lambda^{2}}{2!} + \dfrac{\lambda^{3}}{3!} + \dfrac{\lambda^{4}}{4!} + \dots = e^{\lambda}$; this is an example

of an advanced topic in Pure Maths where functions like exponentials, 
logarithms and the trigonometric functions have (infinite) power 
series forms. Truncated forms of these infinite series are how 
electronic calculators obtain values of these functions. 
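
As an illustration of the note above, the following sketch sums a truncated form of the power series and compares it with the exponential function. The value λ = 3 and the choice of 30 terms are arbitrary assumptions made for this check.

```python
from math import exp, factorial

def truncated_exp_series(lam, terms):
    """Sum of lam^r / r! for r = 0, 1, ..., terms - 1."""
    return sum(lam**r / factorial(r) for r in range(terms))

lam = 3.0                                  # illustrative value only
series = truncated_exp_series(lam, 30)     # 30 terms is an arbitrary cut-off
total_probability = exp(-lam) * series     # sum of P(X = r) for r = 0, ..., 29

print(series, exp(lam))       # the two values agree to many decimal places
print(total_probability)      # very close to 1
```

The terms shrink rapidly once r is larger than λ, which is why a modest number of terms is enough.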


Exercise 1.1 


1. If X ~ Po(2) find i) P(X = 1) ii) P(X = 2) iii) P(X = 3).
2. If X ~ Po(1.8) find i) P(X = 0) ii) P(X = 1) iii) P(X = 2).
3. If X ~ Po(5.3) find i) P(X = 3) ii) P(X = 5) iii) P(X = 7).
4. If X ~ Po(0.4) find i) P(X = 0) ii) P(X = 1) iii) P(X = 2).
5. If X ~ Po(2.15) find i) P(X = 2) ii) P(X = 4) iii) P(X = 6).
6. If X ~ Po(3.2) find i) P(X = 2) ii) P(X < 2) iii) P(X > 2).


7. The number of telephone calls arriving at an office switchboard in a 
5-minute period may be modelled by a Poisson distribution with 
parameter 3.2. Find the probability that in a 5-minute period 


a) exactly 2 calls are received 
b) more than 2 calls are received. 
8. The number of accidents which occur on a particular stretch of road in
a day may be modelled by a Poisson distribution with parameter 1.3.
Find the probability that on a particular day

a) exactly 2 accidents occur on that stretch of road
b) fewer than 2 accidents occur.




1.2 The role of the parameter of the Poisson distribution 


The mean number of events in an interval of time or space is proportional 
to the size of the interval. 


Example 2 in Section 1.1 looked at the number of cars passing a point on 
a road during a 5-minute period. This may be modelled by the Poisson 
distribution with parameter 4. 


In this case, the number of cars passing that point in a 20-minute 
period may be modelled by the Poisson distribution with parameter 16, 
and in a 1-minute period may be modelled by the Poisson distribution 
with parameter 0.8. 


If the conditions for a Poisson distribution are satisfied in a given period, 
they are also satisfied for periods of different length. 


Example 3 


The number of accidents in a week on a stretch of road is known to follow 
a Poisson distribution with mean 2.1. 


Find the probability that 

a) in a given week there is 1 accident

b) in a two-week period there are 2 accidents

c) there is 1 accident in each of two successive weeks.


a) In one week, the number of accidents follows a Po(2.1) distribution,
so the probability of 1 accident $= \dfrac{e^{-2.1} \times 2.1^{1}}{1!} = 0.257$ (3 s.f.).


b) In two weeks, the number of accidents follows a Po(4.2) distribution,
so the probability of 2 accidents $= \dfrac{e^{-4.2} \times 4.2^{2}}{2!} = 0.132$ (3 s.f.).


c) This cannot be done directly as a Poisson distribution since it says what has to
happen in each of two time periods, but these are the outcomes
considered in part a).


So the probability this happens in two successive weeks is $\left(\dfrac{e^{-2.1} \times 2.1^{1}}{1!}\right)^{2} = 0.0661$ (3 s.f.).
Example 4


The number of flaws in a metre length of dress material is known to 
follow a Poisson distribution with parameter 0.4. 


Find the probabilities that 
a) there are no flaws in a 1 metre length
b) there is 1 flaw in a 3 metre length
c) there is 1 flaw in a piece of material which is half a metre long.


a) X ~ Po(0.4) ⇒ $P(X = 0) = \dfrac{e^{-0.4} \times 0.4^{0}}{0!} = 0.670$ (3 s.f.).


b) Y ~ Po(1.2) ⇒ $P(Y = 1) = \dfrac{e^{-1.2} \times 1.2^{1}}{1!} = 0.361$ (3 s.f.).


c) Z ~ Po(0.2) ⇒ $P(Z = 1) = \dfrac{e^{-0.2} \times 0.2^{1}}{1!} = 0.164$ (3 s.f.).
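
The idea that the parameter scales in proportion to the size of the interval can be sketched in Python using the figures from Example 4. The helper names `poisson_pmf` and `scaled_parameter` are illustrative assumptions, not standard functions.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

def scaled_parameter(rate_per_unit, size):
    """Poisson parameter for an interval of the given size (rate is per unit of size)."""
    return rate_per_unit * size

rate = 0.4  # flaws per metre, as in Example 4

p_a = poisson_pmf(0, scaled_parameter(rate, 1))    # no flaws in 1 metre
p_b = poisson_pmf(1, scaled_parameter(rate, 3))    # 1 flaw in 3 metres
p_c = poisson_pmf(1, scaled_parameter(rate, 0.5))  # 1 flaw in half a metre

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
# prints 0.67 0.361 0.164
```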


Exercise 1.2 


1. The number of telephone calls arriving at an office switchboard in a
5-minute period may be modelled by a Poisson distribution with
parameter 1.4. Find the probability that in a 10-minute period

a) exactly 2 calls are received
b) more than 2 calls are received.


2. The number of accidents which occur on a particular stretch of road in
a day may be modelled by a Poisson distribution with parameter 0.4.
Find the probability that during a week (7 days)

a) exactly 2 accidents occur on that stretch of road
b) fewer than 2 accidents occur.


3. The number of letters delivered to a house on a day may be modelled by
a Poisson distribution with parameter 0.8.

a) Find the probability that there are 2 letters delivered on a particular day.
b) The home owner is away for 3 days. Find the probability that there
will be more than 2 letters waiting for him when he gets back.


4. The number of errors on a page of a booklet can be modelled by a
Poisson distribution with parameter 0.2.

a) Find the probability that there is exactly 1 error on a given page.

b) A section of the booklet has 7 pages. Find the probability that there
are no more than 2 errors in the section.

c) The booklet has 25 pages altogether. Find the probability that the
booklet contains exactly 6 errors altogether.




5. The number of people calling a car breakdown service can be modelled by 
a Poisson distribution, and the service has an average of 6 calls per hour. 
Find the probability that in a half-hour period 


a) exactly 2 calls are received 


b) more than 2 calls are received. 


1.3 The recurrence relation for the Poisson distribution


You can calculate probabilities for a Poisson distribution in sequence using 
a recurrence relation. 


Example 5 
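If X ~ Po(λ)


a) write down the probability that
i) X = 3 and ii) X = 4


i) $P(X = 3) = \dfrac{e^{-\lambda}\lambda^{3}}{3!}$

ii) $P(X = 4) = \dfrac{e^{-\lambda}\lambda^{4}}{4!} = \left(\dfrac{e^{-\lambda}\lambda^{3}}{3!}\right) \times \dfrac{\lambda}{4} = \dfrac{\lambda}{4} \times P(X = 3)$


The general relationship is $P(X = k + 1) = \dfrac{\lambda}{k + 1} \times P(X = k)$.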
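
The recurrence relation lends itself to a short loop. The sketch below starts from P(X = 0) = e^(−λ) and multiplies by λ/(k + 1) at each step; the function name `poisson_probabilities` and the choice λ = 2.5 are illustrative assumptions.

```python
from math import exp, factorial

def poisson_probabilities(lam, up_to):
    """List of P(X = 0), ..., P(X = up_to) built with the recurrence relation."""
    probs = [exp(-lam)]                      # P(X = 0) = e^(-lam)
    for k in range(up_to):
        # P(X = k + 1) = lam / (k + 1) * P(X = k)
        probs.append(probs[-1] * lam / (k + 1))
    return probs

lam = 2.5                                    # illustrative value only
probs = poisson_probabilities(lam, 10)

# Cross-check each value against the direct formula e^(-lam) * lam^r / r!
direct = [exp(-lam) * lam**r / factorial(r) for r in range(11)]
largest_gap = max(abs(p - d) for p, d in zip(probs, direct))
print(largest_gap)   # tiny: only floating-point rounding separates the two lists
```

Working in sequence like this avoids recomputing powers and factorials for every value of r.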


The graphs show the probability distributions for different values of λ and
what effect changing the value of λ has on the shape of a particular Poisson distribution.
[Graph: Poisson distribution with λ = 1.2 (= E(X)); vertical axis: probability of x]


All Poisson variables have a sample space
which is all of the non-negative integers.
However, when λ is relatively low, the
probabilities tail off very quickly.
$\dfrac{1.2}{1} = 1.2$; $\dfrac{1.2}{2} = 0.6$; $\dfrac{1.2}{3} = 0.4$; $\dfrac{1.2}{4} = 0.3$,
so the initial probability that X = 0 is multiplied
by 1.2, then 0.6, then 0.4, 0.3, ... and so the
mode of X is 1.


[Graph: Poisson distribution with λ = 2.5 (= E(X)); vertical axis: probability of x]


Here λ is larger than in the previous graph and
the peak has moved across to the right.

For values of x which are less than λ the
probability increases, but once x is greater than
λ the probabilities start to decrease.

More values of x have a noticeable probability,
so the highest individual probability is not as
large as it was in the previous graph and the
distribution is more spread out.


[Graph: Poisson distribution with λ = 4 (= E(X)); vertical axis: probability of x]


What happens when λ is an integer?

Here $P(X = 4) = P(X = 3) \times \dfrac{4}{4} = P(X = 3)$ and
the distribution has two modes, at 3 and 4.
Generally, the mode of the Poisson (λ)
distribution is at the integer below λ when λ is
not an integer and there are two modes (at λ
and λ − 1) when it is an integer.


[Graph: Poisson distribution with λ = 0.8 (= E(X)); vertical axis: probability of x]


λ < 1 is a special case.

Here even the first time the recurrence relation
is used you are multiplying by a number less than 1, so the mode
will be 0 and the probability distribution is
strictly decreasing for all values of x.
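
The rule for the mode described alongside the graphs can be written as a small function and checked against a direct search through the probabilities. This is a sketch that assumes λ > 0; the names `poisson_modes` and `modes_by_search` are made up for the illustration.

```python
from math import exp, floor

def poisson_modes(lam):
    """Mode(s) of Po(lam) using the rule in the text (assumes lam > 0)."""
    if float(lam).is_integer():
        return [int(lam) - 1, int(lam)]   # two modes when lam is an integer
    return [floor(lam)]                   # otherwise the integer below lam

def modes_by_search(lam, up_to=50):
    """Cross-check: build the probabilities with the recurrence and find the largest."""
    probs = [exp(-lam)]
    for k in range(up_to):
        probs.append(probs[-1] * lam / (k + 1))
    biggest = max(probs)
    return [k for k, p in enumerate(probs) if abs(p - biggest) < 1e-12]

for lam in (1.2, 2.5, 4, 0.8):            # the four values used in the graphs
    print(lam, poisson_modes(lam), modes_by_search(lam))
```

For λ = 4 both functions return the two modes 3 and 4, and for λ = 0.8 both return 0.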




The general forms for the probabilities of 0 and 1 for a Poisson distribution are 


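$P(X = 0) = \dfrac{e^{-\lambda} \times \lambda^{0}}{0!} = e^{-\lambda}$ and $P(X = 1) = \dfrac{e^{-\lambda} \times \lambda^{1}}{1!} = \lambda e^{-\lambda}$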


Since 5.8 is not an integer, the mode is the integer below it, 
i.e. the mode is 5. 


Exercise 1.3 
1. X ~ Po(2.5)
a) Write down an expression for P(X = 4) in terms of P(X = 3).
b) If P(X = 3) = 0.214, calculate the value of your expression in part a).
c) Calculate P(X = 4) directly and check it is the same as your answer to b).
d) What is the mode of X?
2. X ~ Po(5)
a) Write down an expression for P(X = 5) in terms of P(X = 4).
b) Explain why X has two modes at 4 and 5.
3. X ~ Po(λ) and P(X = 4) = 1.2 × P(X = 3).
a) Find the value of λ.
b) What is the mode of X? 
1.4 Mean and variance of the Poisson distribution


If X ~ Po(λ), then E(X) = λ; Var(X) = λ ⇒ st. dev. (σ) = √λ.


A special property of the Poisson distribution is that the mean and variance 
are always equal. 


Example 8 


The number of calls arriving at a company’s switchboard in a 10-minute period can be modelled 
by a Poisson distribution with parameter 3.5. 


Give the mean and variance of the number of calls which arrive in 


ii) Here λ = 21 (= 3.5 × 6) so the mean and variance will both be 21.
iii) Here λ = 1.75 (= 3.5 ÷ 2) so the mean and variance will both be 1.75.


Example 9 
A dual carriageway has one lane blocked off because of roadworks. 


The number of cars passing a point in a road in a number of 1-minute intervals is summarised in 
the table. 


Number of cars | 0 | 1 | 2 | 3  | 4  | 5 | 6
Frequency      | 3 | 4 | 4 | 25 | 30 | 3 | 1


a) Calculate the mean and variance of the number of cars passing in 1-minute intervals. 


b) Is the Poisson likely to provide an adequate model for the distribution of the number of cars 
passing in 1-minute intervals? 


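a) $\sum f = 70$, $\sum xf = 228$, $\sum x^{2}f = 836$, so $\bar{x} = \dfrac{228}{70} = 3.26$ (3 s.f.)


and $\mathrm{Var}(X) = \dfrac{\sum x^{2}f}{\sum f} - \bar{x}^{2} = \dfrac{836}{70} - \left(\dfrac{228}{70}\right)^{2} = 1.33$ (3 s.f.).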
b) The mean and variance are not numerically close so it is unlikely the Poisson will be an 
adequate model (with only one lane open for traffic, overtaking cannot happen on this stretch 
of the road and the numbers of cars will be much more consistent than would happen in 
normal circumstances — hence the variance is much lower than would be expected if the 
Poisson model did apply). 
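
The summary statistics in Example 9 can be reproduced from a frequency table in Python. Note the assumption: the frequencies used below (3, 4, 4, 25, 30, 3, 1) are chosen because they reproduce the totals Σf = 70, Σxf = 228 and Σx²f = 836 quoted in the solution; they should be checked against the printed table.

```python
# Assumed frequency table: number of cars -> frequency (reproduces the quoted totals)
frequencies = {0: 3, 1: 4, 2: 4, 3: 25, 4: 30, 5: 3, 6: 1}

n = sum(frequencies.values())                            # sum of f
sum_xf = sum(x * f for x, f in frequencies.items())      # sum of x*f
sum_x2f = sum(x * x * f for x, f in frequencies.items()) # sum of x^2*f

mean = sum_xf / n
variance = sum_x2f / n - mean**2

print(n, sum_xf, sum_x2f)                 # prints 70 228 836
print(round(mean, 2), round(variance, 2)) # prints 3.26 1.33
```

A mean of about 3.26 against a variance of about 1.33 is the numerical mismatch referred to in part b).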




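Derivation of mean and variance of the
Poisson distribution
You must be able to use these results but are not required to be able to prove
them. They are included here for completeness, and as a nice manipulation
using the power series expression for the exponential function.


$X \sim \mathrm{Po}(\lambda) \Leftrightarrow P(X = k) = \dfrac{e^{-\lambda}\lambda^{k}}{k!}$

$E(X) = \sum_{k=0}^{\infty} k \times \dfrac{e^{-\lambda}\lambda^{k}}{k!} = 0 \times \dfrac{e^{-\lambda}\lambda^{0}}{0!} + \sum_{k=1}^{\infty} k \times \dfrac{e^{-\lambda}\lambda^{k}}{k!}$

$= \lambda \times \sum_{k=1}^{\infty} \dfrac{e^{-\lambda}\lambda^{k-1}}{(k-1)!}$ (cancelling k, after discarding the zero case)

$= \lambda$, since the remaining sum is the total of all the Po(λ) probabilities, which is 1.

$E(X^{2}) = \sum_{k=0}^{\infty} k^{2} \times \dfrac{e^{-\lambda}\lambda^{k}}{k!} = \lambda \times \sum_{k=1}^{\infty} k \times \dfrac{e^{-\lambda}\lambda^{k-1}}{(k-1)!} = \lambda \times \sum_{k=1}^{\infty} \{(k-1) + 1\} \times \dfrac{e^{-\lambda}\lambda^{k-1}}{(k-1)!}$

$= \lambda \times (\lambda + 1) = \lambda^{2} + \lambda$

Then $\mathrm{Var}(X) = \lambda^{2} + \lambda - \lambda^{2} = \lambda$.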
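
A numerical check of these results is sketched below. It sums k × P(X = k) and k² × P(X = k) over the first 200 values of k, an arbitrary cut-off that is ample for small λ; the function name `poisson_moments` and the value λ = 3.5 are illustrative assumptions.

```python
from math import exp

def poisson_moments(lam, terms=200):
    """Approximate (mean, variance) of Po(lam) by summing the first `terms` values of k."""
    p = exp(-lam)            # P(X = 0)
    mean = 0.0
    mean_of_squares = 0.0
    for k in range(terms):
        mean += k * p
        mean_of_squares += k * k * p
        p *= lam / (k + 1)   # recurrence: move from P(X = k) to P(X = k + 1)
    return mean, mean_of_squares - mean**2

mean, variance = poisson_moments(3.5)
print(mean, variance)        # both very close to 3.5
```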
Exercise 1.4 
1. If X ~ Po(3.2) find i) E(X) ii) Var(X).


2. If X ~ Po(49) find the mean and standard deviation of X.
3. X ~ Po(3.6)
a) Find the mean and standard deviation of X.
b) Find P(X > μ), where μ = E(X).
c) Find P(X > μ + 2σ), where σ is the standard deviation of X.
d) Find P(X < μ − 2σ).


4. X is the number of telephone calls arriving at an office switchboard in a 10-minute
period. X may be modelled by a Poisson distribution with parameter 6.
a) Find the mean and standard deviation of X.
b) Find P(X > μ), where μ = E(X).
c) Find P(X > μ + 2σ), where σ is the standard deviation of X.
d) Find P(X < μ − 2σ).


5. Compare your answers to part d) of questions 3 and 4. 
1.5 Modelling with the Poisson distribution


The Poisson distribution describes the number of occurrences in a fixed 
period of time or space if the events occur independently of one another, at 
random and at a constant average rate. 


Standard examples of situations in real-life which can often be modelled 
reasonably by the Poisson distribution include: 

radioactive emissions, traffic passing a fixed point, telephone calls or 
letters arriving, and accidents occurring. 


Example 10 
The maternity ward of a hospital wanted to work out how many births would be likely to happen 
during a night. 
The hospital has 3000 deliveries each year, so if these happen randomly around the clock 1000 
deliveries would occur between the hours of midnight and 8 am. This is the time when many 
staff are off duty and it is important to ensure that there will be enough people to cope with the
workload on any particular night.


The average number of deliveries per night is $\dfrac{1000}{365}$, which is 2.74.


From this average rate the probability of delivering 0, 1, 2, etc. babies each night can be calculated
using the Poisson distribution. If X is a random variable representing the number of deliveries per
night, some probabilities are


$P(X = 0) = 2.74^{0} \times \dfrac{e^{-2.74}}{0!} = 0.065$

$P(X = 1) = 2.74^{1} \times \dfrac{e^{-2.74}}{1!} = 0.177$

$P(X = 2) = 2.74^{2} \times \dfrac{e^{-2.74}}{2!} = 0.242$

$P(X = 3) = 2.74^{3} \times \dfrac{e^{-2.74}}{3!} = 0.221$.


i) On how many days in the year would 5 or more deliveries be likely to occur?

ii) Over the course of one year, what is the greatest number of deliveries
likely to occur at least once?

iii) Why might the pattern of deliveries not follow a Poisson distribution?


i) 52 ≈ 365 × P(X ≥ 5).

ii) 8, the largest value for which the probability is greater than $\dfrac{1}{365}$.

iii) If deliveries were not random throughout the 24 hours,
e.g. if a lot of women had labour induced or had elective caesareans done during the day.
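
The answers to parts i) and ii) of Example 10 can be reproduced with a short calculation. The sketch below uses the unrounded rate 1000/365 rather than 2.74, and the helper name `poisson_pmf` is an assumption made for the illustration.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

lam = 1000 / 365    # average deliveries per night (about 2.74)

# i) expected number of nights in a year with 5 or more deliveries
p_at_least_5 = 1 - sum(poisson_pmf(r, lam) for r in range(5))
expected_nights = 365 * p_at_least_5

# ii) largest number of deliveries whose probability exceeds 1/365
greatest = max(r for r in range(30) if poisson_pmf(r, lam) > 1 / 365)

print(round(expected_nights), greatest)   # prints 52 8
```

The search in part ii) stops at 30 deliveries, which is far beyond any value with a probability above 1/365 for this rate.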


Did you know? 
An elective caesarean is planned in advance for some births which are 
expected to be difficult. 




In this real-life example, deliveries in 
fact followed the Poisson distribution 
very closely, and the hospital was able to 
predict the workload accurately. 


The conditions for the Poisson distribution are that 

i) events occur at random 

ii) events occur independently of one another 

iii) the average rate of occurrences remains constant 

iv) there is zero probability of simultaneous occurrences. 


As with the distributions you met in S1, the Poisson can be a useful model for a situation
even when these conditions are not met perfectly. 


Be careful: 

Some change in the underlying conditions may alter the nature of the distribution, 
e.g. traffic observed close to a junction, or where there are lane restrictions and traffic 
is funnelled into a queue travelling at constant speed. 




The underlying conditions may be distorted by interference from
other effects, e.g. if a birthday or Christmas occurs during the period
considered then the Poisson conditions would not be reasonable for the
arrival of letters by post, for instance.


Randomness or independence may be lost due to a difference in the average rate 
of occurrences, e.g. the rate of traffic accidents occurring would be expected to vary 
somewhat as road conditions vary. 


Example 11 
The number of cyclists passing a remote village post-office during the day can be modelled as a 
Poisson random variable. On average two cyclists pass by in an hour. 
a) What is the probability that 
i) no cyclist passes 
ii) more than three cyclists pass by between 10 and 11am? 


b) What is the probability that exactly one passes by while the shop-keeper is on a 20-minute 
tea-break? 

c) What is the probability that more than three cyclists pass by in an hour exactly once in a 
6-hour period? 
a) In an hour (parts i and ii) λ = 2.
i) P(X = 0) = 0.1353.
ii) P(X > 3) = 1 − P(X ≤ 3) = 1 − 0.8571 = 0.1429.


b) In a 20-minute period ($\tfrac{1}{3}$ of an hour), the mean number of cyclists will be $2 \times \dfrac{1}{3} = \dfrac{2}{3}$.


$P(\text{exactly one}) = \dfrac{e^{-2/3} \times \left(\tfrac{2}{3}\right)^{1}}{1!} = 0.342$ (3 s.f.).


c) The situation is that of a binomial distribution: there are 6 ‘trials’ (the six one-hour periods),
the number of cyclists in each hour is independent of the other
periods, and the probability of more than 3 in an hour remains
the same for each hour, i.e. if Y = number of hours in the
6-hour period in which more than 3 cyclists pass by, then
Y ~ B(6, 0.1429) (using the probability calculated
in part a) ii)).

$P(Y = 1) = 6 \times 0.1429^{1} \times (1 - 0.1429)^{5} = 0.397$ (3 s.f.).
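
Example 11 combines a Poisson calculation with a binomial one, and the two steps can be sketched in Python as follows. The helper name `poisson_pmf` is an illustrative assumption; `comb` is the standard-library binomial coefficient.

```python
from math import comb, exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# a) one hour, lam = 2
p_none = poisson_pmf(0, 2)
p_more_than_3 = 1 - sum(poisson_pmf(r, 2) for r in range(4))   # 1 - P(X <= 3)

# b) 20 minutes, lam = 2/3
p_tea_break = poisson_pmf(1, 2 / 3)

# c) Y ~ B(6, p) where p = P(more than 3 cyclists in an hour)
p_exactly_once = comb(6, 1) * p_more_than_3 * (1 - p_more_than_3) ** 5

print(round(p_none, 4), round(p_more_than_3, 4))       # prints 0.1353 0.1429
print(round(p_tea_break, 3), round(p_exactly_once, 3)) # prints 0.342 0.397
```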


Example 12 


At a certain harbour the number of boats arriving in a 15-minute period can be modelled by a 
Poisson distribution with parameter 1.5. 


a) Find the probability that exactly six boats will arrive in a period of an hour. 


b) Given that exactly six boats arrive in a period of an hour, find the conditional probability that 
twice as many arrive in the second half hour as arrive in the first half hour. 


a) In an hour the average number of boats arriving is 6, so


$P(\text{6 boats arrive in an hour}) = \dfrac{e^{-6} \times 6^{6}}{6!} = 0.161$.


b) If twice as many arrive in the second half hour, then there needs to be 2 in a half-hour period
and then 4 in the next half hour, so


P(2 boats arrive in half hour, then 4 boats in next half hour)


$= \dfrac{e^{-3} \times 3^{2}}{2!} \times \dfrac{e^{-3} \times 3^{4}}{4!} = 0.224 \times 0.168 = 0.0376$.


Then the conditional probability is


$P(\text{2 then 4 in half hours} \mid \text{6 boats arrive in an hour}) = \dfrac{0.0376}{0.161} = 0.234$.
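
The conditional probability in Example 12 can be checked numerically. The sketch below also includes a cross-check that is not part of the worked solution: given 6 arrivals in the hour, the number falling in the first half hour follows B(6, 0.5), which gives the same answer. The helper name `poisson_pmf` is an illustrative assumption.

```python
from math import comb, exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

p_six_in_hour = poisson_pmf(6, 6)                   # part a), lam = 4 x 1.5 = 6
p_2_then_4 = poisson_pmf(2, 3) * poisson_pmf(4, 3)  # lam = 3 for each half hour
conditional = p_2_then_4 / p_six_in_hour            # part b)

# Cross-check (an extra observation, not from the worked solution):
# P(2 of the 6 arrivals fall in the first half hour) under B(6, 0.5)
binomial_check = comb(6, 2) * 0.5**6

print(round(p_six_in_hour, 3), round(conditional, 3), binomial_check)
# prints 0.161 0.234 0.234375
```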
Exercise 1.5
1. For the following random variables state whether they can be modelled
by a Poisson distribution.
If they can, give the value of the parameter λ; if they cannot then explain why.
a) The average number of cars per minute passing a point on a road is 12.
The traffic is flowing freely.
X = number of cars which pass in a 15 second period.
b) The average number of cars per minute passing a point on a road is 14.
There are roadworks blocking one lane of the road.
X = number of cars which pass in a 30 second period.
c) Amelie normally gets letters at an average rate of 1.5 per day. 
X = number of letters Amelie gets on December 22nd. 


d) A petrol station which stays open all the time gets an average of 
832 customers in a 24 hour time period. 


X = number of customers in a quarter of an hour at the petrol station. 
e) An A&E department in a hospital treats 32 patients an hour on average. 
X = number of patients treated between 5 pm and 7 pm on a Friday evening. 
2. For the following situations state what assumptions are needed if a
Poisson distribution is to be used to model them, and give the value
of λ that would be used.


You are not expected to do any calculations!

a) On average defects in a roll of cloth occur at a rate of 0.2 per metre.
How many defects are there in a roll which is 8 m long?

b) On average defects in a roll of cloth occur once in 2 metres.
How many defects are there in a roll which is 8 m long?

c) A small shop averages 8 customers per hour.
How many customers does it have in 20 minutes?

3. An explorer thinks that the number of mosquito bites he gets when he is 
in the jungle will follow a Poisson distribution. 


The explorer records the number of mosquito bites he gets in the jungle 
during a number of hour-long periods, and the results are summarised 
in the table. 


Number of bites | 0 | 1 | 2 | 3 | 4 | 5 | 6 | ≥7
Frequency       | 3 | 7 | 9 | 6 | 6 | 3 | 1 | 0




a) Calculate the mean and variance of the number of bites the explorer gets 
in an hour in the jungle. 


b) Do you think the Poisson is a good model for the number of bites 
the explorer gets in an hour in the jungle? 
4. The number of emails Serena gets can be modelled by a Poisson distribution 
with a mean rate of 1.5 per hour. 
a) i) What is the probability that Serena gets no emails between 4 pm and 5 pm?


ii) What is the probability that Serena gets more than 2 emails 
between 4 pm and 5 pm? 


iii) What is the probability that Serena gets one email between 6 pm and 6.20 pm? 


b) What is the probability that Serena gets more than 2 emails in 
an hour exactly twice in a 5-hour period? 


c) Would it be sensible to use the Poisson distribution to find the 
probability that Serena gets no emails between 4 am and 5 am? 


5. The number of lightning strikes in the neighbourhood of a campsite in a 
week can be modelled by a Poisson distribution with parameter 1.5. 


a) Find the probability that there is exactly one lightning strike in the 
neighbourhood in a given week. 


b) Alejandra spends three weeks at the campsite. Find the probability that there 
are exactly three lightning strikes in the neighbourhood during her holiday. 


c) Given that the neighbourhood has exactly three lightning strikes during her 
holiday, find the conditional probability that each week has exactly one strike. 


Summary exercise 1 


1. If X ~ Po(1.45) find
a) P(X = 2)
b) P(X < 2)
c) P(X > 2)


2. If X ~ Po(…)
a) find i) P(X = 0) ii) P(X = 1) iii) P(X > 2).
b) For X, write down the i) mean ii) variance iii) standard deviation.
c) Explain why the mode of X is 3.
d) Find i) P(X < μ) ii) P(|X − μ| < σ),
where μ is the mean and σ is the standard deviation of X.


3. If X ~ Po(6.4)
a) find i) P(X < 3) ii) P(X = 6).
b) For X, write down the i) mean ii) variance iii) standard deviation.
c) Write down the mode of X.
d) Find i) P(X < μ − σ) ii) P(|X − μ| < σ), where μ is the mean
and σ is the standard deviation of X.
4. The number of telephone calls arriving at Sharma's home in a 15-minute period may
be modelled by a Poisson distribution with parameter 0.4. Find the probability that in an hour
a) exactly 2 calls are received
b) more than 2 calls are received.


5. The number of accidents which occur on a particular stretch of road in a day may be
modelled by a Poisson distribution with parameter 1.6. Find the probability that on
a particular day
a) exactly 2 accidents occur on that stretch of road
b) less than 2 accidents occur.


6. X is the number of telephone calls arriving at an office switchboard in a 10-minute
period. X may be modelled by a Poisson distribution with parameter 4.
a) Find the mean and standard deviation of X.
b) Find P(X > μ), where μ = E(X).
c) Find P(|X − μ| < σ), where σ is the standard deviation of X.


7. An urban safety officer thinks that the number of traffic accidents in an area will
follow a Poisson distribution.

The officer records the number of accidents in the area each week over a period of several
months, and the results are summarised in the table.


Number of 


- 0;1/2/3)4/5)6 
accidents 


Frequency |5/1/1/)3|}5|2)1] 0 


a) Calculate the mean and variance of the 
number of accidents in the area in a 
week. 




b) Do you think the Poisson is a good 
model for the number of accidents in the 
area in a week? 


8. The number of errors on a page of a book can be modelled by a Poisson distribution
with parameter 0.15.

a) Find the probability that there is exactly 1 error on a given page.

b) A chapter of the book has 20 pages. Find the probability that there are no
more than 2 errors in the chapter.

c) What is the most likely number of errors in the chapter?


EXAM-STYLE QUESTIONS 


9. The number of errors on a page of the first proofs of a book can be modelled by a
Poisson distribution with parameter 0.6.

a) Find the probability that a page has exactly one error on it.

b) Find the probability that a double page spread has exactly two errors on it.

c) Given that a double page spread has exactly two errors on it, find the
conditional probability that each page has exactly one error on it.


10. A shop sells spades. The demand for spades follows a Poisson distribution with mean
2.7 per week.

a) Find the probability that the demand is exactly 2 spades in any one week.

b) The shop has 4 spades in stock at the beginning of a week. Find the
probability that this will be enough to satisfy the demand for spades in that week.

c) Given instead that there are n spades in stock, find, by trial and error, the least
value of n for which the probability of not being able to satisfy the demand for
spades in that week is less than 0.1.


11. The random variable X has the distribution Po(2.5). The random variable Y is defined
by Y = 2X.
a) Find the mean and variance of Y.
b) Give a reason why the variable Y does not have a Poisson distribution.


12. Cars travelling south on a rural road pass a particular point randomly and
independently at an average rate of 2 cars every three minutes.
a) Find the probability that exactly 3 cars travel south past that point in a 5-minute
period.

Cars travelling north on that road pass the same point randomly and independently at
an average rate of 1 car each minute.
b) Find the probability that a total of fewer than 4 cars pass that point in a 3-minute
period.


13. The number of lightning strikes at a particular place in a 28-day period has a
Poisson distribution with mean 1.2.
a) Find the probability that at most 2 lightning strikes will be recorded at that
place in a 42-day period.
b) Find, in days, correct to 1 decimal place, the longest time period for which the
probability that no lightning strikes will be recorded at that place is at least 0.9.

Chapter summary 
• The Poisson distribution is defined as
  $P(X = r) = \dfrac{e^{-\lambda}\lambda^{r}}{r!}$ for r = 0, 1, 2, 3, ...
• The Poisson distribution has a single parameter, λ.
• The Poisson distribution is often written as X ~ Po(λ).
• If X ~ Po(λ), then E(X) = λ; Var(X) = σ² = λ ⇒ st. dev. (σ) = √λ.
• The conditions for the Poisson are

i) events occur at random

ii) events occur independently of one another

iii) the average rate of occurrences remains constant

iv) there is zero probability of simultaneous occurrences.

The Poisson distribution 


Approximations involving the Poisson distribution


The Poisson provides a good approximation 
to binomial distributions where n is large 
under certain conditions. 

For example, the number of genetic mutations in a stretch of DNA can be
modelled well by the Poisson distribution. A lot of work is currently being
done to understand the processes involved in genetic mutations in both the
plant and animal domains, with the possibility of significant medical
advances in the treatment of diseases like cancer and Parkinson's.


Objectives 

After studying this chapter you should be able to: 

• Use the Poisson distribution as an approximation to the binomial distribution where
appropriate (n > 50 and np < 5, approximately).


• Use the normal distribution, with continuity correction, as an approximation to the Poisson
distribution where appropriate (λ > 15, approximately).


Before you start 
You should know how to:

1. Calculate probabilities using the binomial distribution, e.g.
X ~ B(10, 0.3). Find P(X = 2).
$P(X = 2) = \binom{10}{2} \times 0.3^{2} \times 0.7^{8} = 0.233$ (3 s.f.)

2. Calculate probabilities using the normal distribution, e.g.
X ~ N(40, 15). Find P(X < 44.2).
$P(X < 44.2) = P\left(Z < \dfrac{44.2 - 40}{\sqrt{15}}\right) = \Phi(1.084) = 0.861$ (3 s.f.)

Skills check:

1. X ~ B(40, 0.03). Find P(X < 2).
2. X ~ N(20, 20). Find P(X < 17.1).


2.1 Poisson as an approximation to the binomial


In the last chapter of S1 you met the use of the normal distribution as
an approximation to the binomial distribution, provided certain conditions
were satisfied by the parameters n and p. Here we meet a second
approximation to the binomial.


If X ~ B(n, p) with n large (n > 50) and p close to 0 (np < 5) then
X ~ approximately Po(λ) with λ = np.


Here are some examples where the binomial and Poisson distributions
have the same mean:


[Figure: four bar charts of probability against x (0 to 12), each comparing the Poisson distribution with mean 4 to a binomial distribution with the same mean.]

Poisson (mean = 4) and binomial (n = 10, p = 0.4):
The mean of the binomial is 4 and the variance is 2.4.
The two sets of probabilities are not particularly similar.

Poisson (mean = 4) and binomial (n = 40, p = 0.1):
The variance of the binomial is now 3.6 (remember that the variance of the Poisson is 4).
The agreement between the two sets of probabilities is now pretty strong.

Poisson (mean = 4) and binomial (n = 400, p = 0.01); Poisson (mean = 4) and binomial (n = 4000, p = 0.001):
These two graphs both seem to show the binomial and Poisson to be exactly the same, but they are
not. While you cannot see any difference on this scale graphically, there are differences between the
binomial and the Poisson in both cases, and the differences in the last case are much smaller than
the differences when n = 400 and p = 0.01.

There is a fundamental difference in that the Poisson outcome space has no
upper limit whereas the binomial is bounded by the value of n. However, when
n is large and p is small, the probabilities of high values of x are very small so
this is not a problem (in the same way that the normal can never provide an
exact model for physical measurements like heights or weights, because those
measurements cannot take negative values whereas a normal variable can).


The use of the Poisson as an approximation to the binomial improves as
n increases and as p gets smaller.
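
The improvement can be seen numerically. The sketch below, which is an illustration rather than part of the course method, takes the four binomial distributions with mean 4 discussed in this section and finds the largest gap between each binomial probability and the matching Po(4) probability for x from 0 up to 30 (or up to n when n is smaller). The function names are assumptions made for this example.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def max_gap(n, p):
    """Largest absolute difference between B(n, p) and Po(np) probabilities."""
    lam = n * p
    top = min(n, 30)  # values beyond 30 have negligible probability when the mean is 4
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(top + 1))

# The four cases from this section, all with mean np = 4.
cases = [(10, 0.4), (40, 0.1), (400, 0.01), (4000, 0.001)]
gaps = [max_gap(n, p) for n, p in cases]

for (n, p), gap in zip(cases, gaps):
    print(f"n = {n:4d}, p = {p}: largest gap = {gap:.5f}")
```

The gaps shrink each time n is multiplied by 10 and p divided by 10, in line with the statement above.
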


Example 1
The probability that a component coming off a production line is faulty is 0.01.
a) If a sample of size 5 is taken, find the probability that exactly one of the components is faulty.


b) What is the probability that a batch of 250 of these components has more than 3 faulty
components?


a) If X = number of faulty components in the sample then X ~ B(5, 0.01) and
P(X = 1) = 5 × 0.01 × 0.99⁴ = 0.0480 (3 s.f.)


b) If Y = number of faulty components in the batch then Y ~ B(250, 0.01) ≈ Po(2.5)
and P(Y > 3) = 1 − P(Y ≤ 3) = 1 − 0.758 = 0.242 (3 s.f.)
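
As a check on part b), the short sketch below compares the Poisson answer with the exact binomial probability. It is an illustration only; the variable names are assumptions for this example.

```python
import math

n, p = 250, 0.01
lam = n * p  # Poisson parameter, 2.5

# P(Y > 3) = 1 - P(Y <= 3), first with the exact binomial, then with Po(2.5).
exact = 1 - sum(math.comb(n, y) * p**y * (1 - p)**(n - y) for y in range(4))
approx = 1 - sum(math.exp(-lam) * lam**y / math.factorial(y) for y in range(4))

print(f"binomial: {exact:.4f}, Poisson: {approx:.4f}")
```
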


If you are working in a situation where p is close to 1, you can choose to count 
failures instead of successes and still construct an appropriate Poisson approximation. 
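
A minimal sketch of this idea follows, using hypothetical values that do not come from the text (n = 120 trials with success probability 0.98). Counting failures gives a binomial with a small probability 0.02, so the failure count is approximately Po(2.4).

```python
import math

n, p = 120, 0.98   # hypothetical values: p is close to 1
q = 1 - p          # probability of a failure on each trial
lam = n * q        # Poisson parameter for the number of failures, 2.4 here

# P(at least 118 successes) is the same event as P(at most 2 failures).
exact = sum(math.comb(n, f) * q**f * p**(n - f) for f in range(3))
approx = sum(math.exp(-lam) * lam**f / math.factorial(f) for f in range(3))

print(f"binomial: {exact:.4f}, Poisson on failures: {approx:.4f}")
```
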


Exercise 2.1 


1. The proportion of defective pipes coming off a production line is 0.05. 
A sample of 60 pipes is examined. 


a) Using the exact binomial distribution calculate the probabilities that there are 
i) 0 ii) 1 iii) 2 iv) more than 2 
defectives in the sample. 

b) Using an appropriate approximate distribution calculate the probabilities 
that there are 
i) 0 ii) 1 iii) 2 iv) more than 2 


defectives in the sample. 


2. a) State the conditions under which a Poisson distribution may be used 
to approximate a binomial distribution. 

b) 5% of the times a faulty ATM asks for a personal identification number 
(PIN number) it does not register the number entered correctly. If I enter 
my PIN correctly each time, what is the probability that the ATM will not 
register it correctly in 3 attempts? 

c) Over a period of time, 90 attempts are made to enter a PIN. If all of the 
customers enter their PIN correctly, what is the probability that fewer 
than 3 of the attempts are not registered correctly. 




3. In a small town, the football team claim that 95% of the people in town
support them. If the claim is correct and a survey of 80 randomly chosen 
people asks whether they support the football team, find the probability 
that more than 75 people say they do. 


4. A rare but harmless medical condition affects 1 in 200 people. 


a) At a cinema showing which 130 people attend, what is the probability
that exactly one person has the condition?

b) At a concert where the audience is 600, use an appropriate approximate
distribution to find the probability that there are fewer than 5 people 
with the condition. 


5. The Nutty Fruitcase party claim that 1 in 250 people support their policy to 
distribute free fruit and nut chocolate bars to children taking examinations. 
a) In an opinion poll which asks 1000 voters about a range of policies put
forward by different parties, find the probability that 
i) no-one will support the Nutty Fruitcase party policy 
ii) at least 5 people will support the policy. 


b) If the opinion poll had 7 people supporting the policy, does this mean
that the Nutty Fruitcase party have underestimated the support there 
is for this policy? 
6. A rare medical condition affects 1 in 150 sheep. 
a) In a small farm holding with a flock of 180 sheep, what is the probability
that exactly one sheep has the condition? 
b) A large farm has a flock of 500 sheep. Use an appropriate approximate 


distribution to find the probability that there are fewer than 5 sheep with 
the condition. 


2.2 The normal approximation to the Poisson distribution 


For large λ (λ > 15, approximately) you would often use a normal approximation,
particularly when the probability of an interval is required, e.g. P(X > 15) or
P(6 < X < 14), since this is a single calculation for a continuous random variable
but requires multiple calculations for a discrete random variable.


Remember that the normal uses the standard deviation to calculate the
z-score, i.e. $z = \dfrac{x - \mu}{\sigma}$.


You must also include the continuity correction (which you met in S1
when using the normal to approximate another discrete distribution,
the binomial).


The parameters used are the mean and variance of the Poisson, i.e. μ = σ² = λ.
Example 2


If X ~ Po(16) calculate P(11 ≤ X ≤ 15)
a) using the exact Poisson probabilities b) by using a normal approximation.


a) P(11 ≤ X ≤ 15) = P(X = 11, 12, 13, 14, 15)
$= e^{-16}\left(\dfrac{16^{11}}{11!} + \dfrac{16^{12}}{12!} + \dfrac{16^{13}}{13!} + \dfrac{16^{14}}{14!} + \dfrac{16^{15}}{15!}\right) = 0.389$
b) For the Poisson, μ = λ = 16; σ² = λ = 16,
so use the N(16, 16) distribution to approximate the
Po(16) distribution.
The continuity correction says P(11 ≤ X ≤ 15) ≈ P(10.5 < Y < 15.5)
where Y is the approximating normal.
$P(10.5 < Y < 15.5) = P\left(\dfrac{10.5 - 16}{\sqrt{16}} < Z < \dfrac{15.5 - 16}{\sqrt{16}}\right)$
= P(−1.375 < Z < −0.125)
= Φ(1.375) − Φ(0.125) = 0.9155 − 0.5498 = 0.366.
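
The two answers in Example 2 can be reproduced with a few lines of code. This is a sketch for checking the arithmetic, not a required method; `normal_cdf` is a helper name assumed for this example and is built from the standard library error function.

```python
import math

def normal_cdf(x, mean, variance):
    """P(Y < x) for Y ~ N(mean, variance), via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / math.sqrt(2 * variance)))

lam = 16

# a) Exact Poisson probability: add P(X = r) for r = 11, ..., 15.
exact = sum(math.exp(-lam) * lam**r / math.factorial(r) for r in range(11, 16))

# b) Normal approximation N(16, 16) with continuity correction: P(10.5 < Y < 15.5).
approx = normal_cdf(15.5, lam, lam) - normal_cdf(10.5, lam, lam)

print(f"exact: {exact:.3f}, normal approximation: {approx:.3f}")
```
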


Example 3 

The demand for a particular spare part in a car accessory 

shop may be modelled by a Poisson distribution. 

On average the demand per week for that part is 2.5. 

a) The shop has 4 in stock at the start of one week. 
What is the probability that they will not be able 
to supply everyone who asks for that part during 
the week. 

b) The manager is going to be away for 6 weeks, and 
wants to leave sufficient stock that there is no more than a 5% probability of running 
out of any parts while he is away. How many of this particular spare part should 
he have in stock when he leaves? 


a) For the demand in a week, use the Po(2.5) distribution. Then if the demand is 4 or less
the shop can supply all the customers.
P(X ≤ 4) = 0.0821 + 0.2052 + 0.2565 + 0.2138 + 0.1336 = 0.8912.
The probability of not being able to supply all the demand is 1 − 0.891 = 0.109 (3 s.f.).
b) For the demand in 6 weeks, use the Po(15) distribution, which can be approximated
by the N(15, 15) distribution.
You need to find k so that P(demand ≤ k) ≥ 0.95.
Φ⁻¹(0.95) = 1.6449 so you need to find the smallest integer k which satisfies

$\dfrac{(k + 0.5) - 15}{\sqrt{15}} > 1.6449$, which is 21 (solution is k > 20.9).
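
The search for k in part b) can be sketched in code. The block below is illustrative: it finds the smallest k under the normal approximation used above, and also the smallest k using exact Po(15) probabilities so the two can be compared. Because the normal is only an approximation, the two values need not agree. The helper name `normal_cdf` is an assumption for this example.

```python
import math

def normal_cdf(x, mean, variance):
    """P(Y < x) for Y ~ N(mean, variance), via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / math.sqrt(2 * variance)))

lam = 15        # demand over 6 weeks is Po(6 x 2.5) = Po(15)
target = 0.95   # want P(demand <= k) to be at least this

# Normal approximation with continuity correction: P(Y < k + 0.5) >= target.
k_normal = 0
while normal_cdf(k_normal + 0.5, lam, lam) < target:
    k_normal += 1

# Exact Poisson: build up the cumulative probability term by term.
k_exact = 0
cumulative = math.exp(-lam)  # P(X = 0)
while cumulative < target:
    k_exact += 1
    cumulative += math.exp(-lam) * lam**k_exact / math.factorial(k_exact)

print("normal approximation:", k_normal, "exact Poisson:", k_exact)
```
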




Exercise 2.2 


1. Let X ~ Po(λ) and Y ~ N(λ, λ) where λ satisfies the conditions needed for
Y to be used as an approximation for X.
Write down the probability you need to calculate for Y (including the
continuity correction) as the approximation for each of the following
probabilities for X.
a) P(X < 16) b) P(X > 22) c) P(X < 17) d) P(45 < X < 62)

2. Which of the following could reasonably be approximated by a normal distribution?
(For those which can, state the normal distribution that would be used.)
a) X ~ Po(16) b) X ~ Po(12.32) c) X ~ Po(8.5)

3. Use normal approximations to calculate
a) P(X < 42) if X ~ Po(49) b) P(X ≥ 9) if X ~ Po(17.5) c) P(X ≥ 13) if X ~ Po(18.4)
d) P(25 < X < 37) if X ~ Po(21) e) P(43 < X < 55) if X ~ Po(51.34)

4. For X ~ Po(20)
a) calculate P(17 < X < 23)
i) directly ii) using a normal approximation.
b) i) What error is there in using the normal approximation?
ii) Express this as a percentage of the exact probability.

5. Explain briefly the conditions under which the normal distribution can be
used as an approximation to a Poisson distribution.
If the conditions are satisfied, state what normal distribution would be used
to approximate the X ~ Po(λ) distribution.

6. The number of accidents which occur on a particular stretch of road in a day
may be modelled by a Poisson distribution with parameter 0.5.
a) Find the probability that during a week (7 days) exactly 2 accidents occur
on that stretch of road.
b) Find the probability that during a month (30 days) at least 20 accidents
occur on that stretch of road.

7. The number of letters delivered to a house on a day may be modelled by
a Poisson distribution with parameter 1.5.
a) Find the probability that there are 2 letters delivered on a particular day.
b) The home owner is away for 3 weeks (18 days of post). Find the probability
that there will be more than 25 letters waiting for him when he gets back.

8. The number of errors on a page of a book can be modelled by a Poisson
distribution with parameter 0.15. The book has 167 pages. Find the probability
that there are no more than 20 errors in the book.
Summary exercise 2


1. The proportion of defective pipes coming off
a production line is 0.025. A random sample
of 60 pipes is examined.
a) Using the exact binomial distribution 
calculate the probabilities that there are 
i) 0 ii) 1 iii) 2 
iv) more than 2 defectives in the sample. 
b) Using an appropriate approximate 
distribution calculate the probabilities 
that there are 
i) 0 ii) 1 iii) 2 
iv) > 2 defective pipes in the sample. 


2. A rare disease affects 1 in 2000 people on average.

a) Use a suitable approximation to find the
probability that, of a random sample of
7500 people in a city, more than 3 people
have the disease.

b) In a random sample of n people, the
probability that no one has the disease
is less than 0.01. Find the least possible
value of n.


3. Customers arrive at the exchange and
refunds desk in a store at a constant average
rate of 1 every 2 minutes.

a) State one condition for the number of 
customers arriving in a given period to 
be modelled by a Poisson distribution. 

Assume now that a Poisson distribution is a 

suitable model. 

b) Find the probability that exactly 4 
customers will arrive during a randomly 
chosen 10-minute period. 

c) Find the probability that less than 3 
customers will arrive during a randomly 
chosen 5-minute period. 




d) Find an estimate of the probability that fewer
than 25 customers will arrive in a randomly
chosen hour-long period.


EXAM-STYLE QUESTIONS 
4. A random variable X has the distribution 


Po(2.5). 
a) A random value of X is found. 
i) Find P(X>2). 
ii) Find the probability that X = 2 given
that X ≥ 2.


b) Random samples of 150 values of X are 
taken. 


i) Describe fully the distribution of the 
sample mean. 

ii) Find the probability that the mean of 
a random sample of size 150 is less 
than 2.4. 


5. On average 3 people in every 10000 in
Canada have a particular gene. A random
sample of 4000 people in Canada is chosen.
The random variable X denotes the number
of people in the sample who have the gene.
Use an approximating distribution to calculate
the probability that there will be more than
2 people in the sample who have the gene.


6. 2% of bottles on a production line do not
have their tops securely fastened. This fault
occurs randomly. 200 bottles are checked to
see whether the tops are securely fastened.
Use a suitable approximation to find the
probability that fewer than 4 do not have the
top securely fastened.


7. A dissertation contains 5480 words. For each
word, the probability it contains an error is
0.001, and these errors can be assumed to
occur independently. The number of words
with errors in the dissertation is represented
by the random variable X.

a) State the exact distribution of X,
including the value of any parameters.

b) State an approximate distribution for X,
including any parameters, and justify the
use of this approximation.

c) Use this approximate distribution to find
the probability that there are more
than 4 words printed wrongly in the
dissertation.


8. A manufacturer packs computer
components in boxes of 500. On average,
1 in 2000 components is faulty. Use a suitable
approximation to estimate the probability
that a randomly chosen box contains at least
one faulty component.


9. On average 1 in 3000 adults has a certain
medical condition.

a) Use a suitable approximation to find the
probability that, in a random sample
of 4500 people, fewer than 4 have this
condition.

b) In a random sample of n people, where
n is large, the probability that none has
the condition is less than 10%. Find the
smallest possible value of n.


Chapter summary 

• If X ~ B(n, p) with n large (n > 50) and p close to 0 (np < 5) then X ~ approximately Po(λ)
with λ = np.
• If X ~ Po(λ) with λ > 15 (approximately) then X ~ approximately N(λ, λ).
• When the Poisson is approximated by a normal distribution, a continuity correction must
be used.




Linear combination of random variables 


The real world is not simple; many 
things are made up of more than 

one component. It is often easier to 
model each component of a process 
separately than it is to try to produce a 
complex model of the whole process. 
Simulations then allow you to get 

a good idea of what the behaviour 

of the overall process would be. For 
example, simulating the number of 
passengers on a flight, and then the 
baggage and person weights would be 
easier to do separately. 


Objectives 
After studying this chapter you should be able to: 


• Use, in the course of solving problems, the results that
  ○ E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X)
  ○ E(aX + bY) = aE(X) + bE(Y)
  ○ Var(aX + bY) = a²Var(X) + b²Var(Y) for independent X and Y.


Before you start 
You should know how to: Skills check: 


1. Calculate the mean and variance of a 1. Calculate E(X) and Var(X). 
random variable, e.g. 


x alas 
[i oe Be P(X=x)| 02/04] 04 


P(X=x) | 0.1 | 0.6 | 0.3 


Calculate E(X) and Var(X). 


E(X) = (3 × 0.1) + (4 × 0.6) + (5 × 0.3) = 4.2

E(X²) = (9 × 0.1) + (16 × 0.6) + (25 × 0.3) = 18

Var(X) = 18 − 4.2² = 0.36


3.1 Expectation and variance of a linear function 
of a random variable 


In S1 Sections 5.3 and 5.4 you met the expectation and variance of 
a discrete random variable: 


• The mean or expected value of a probability distribution is defined
as μ = E(X) = Σ px.

• The variance of a probability distribution is defined as
Var(X) = E[{X − E(X)}²]. The alternative version (which is easier to
use in practice) is Var(X) = E(X²) − {E(X)}².


In S1 Section 2.5 you saw: 
If a set of data values X is related to a set of values Y so that Y = aX + b, then 


• mean of Y = a × mean of X + b
• standard deviation of Y = |a| × standard deviation of X
• variance of Y = a² × variance of X.


The same relationship applies if X and Y are random variables defined
in the same way (Y = aX + b).


The proof of these results is easiest to do by considering the multiplication
by a constant and adding a constant separately, and then the full result is
obtained just by applying them one after the other. We will show it in full
here for discrete random variables, but the same result holds for
continuous random variables which you will meet in Chapter 5
(where the summation is replaced by integration).


If Y = aX:

$\mu_Y = \sum y\,p = \sum (ax)\,p = a\sum x\,p = a\mu_X$

$E(Y^2) = \sum y^2 p = \sum (ax)^2 p = a^2 \sum x^2 p = a^2 E(X^2)$

$\mathrm{Var}(Y) = E(Y^2) - \mu_Y^2 = a^2 E(X^2) - (a\mu_X)^2 = a^2\{E(X^2) - \mu_X^2\} = a^2\,\mathrm{Var}(X)$.

If Y = X + b:

$\mu_Y = \sum y\,p = \sum (x + b)\,p = \sum x\,p + b\sum p = \mu_X + b$ (since $\sum p = 1$).

It is now easier to use the other form for the variance, i.e. $\mathrm{Var}(X) = E[(X - \mu_X)^2]$,
to derive the variance of Y:

$\mathrm{Var}(Y) = E[(Y - \mu_Y)^2] = E[((X + b) - (\mu_X + b))^2] = E[(X - \mu_X)^2] = \mathrm{Var}(X)$.


For any random variable X:
E(aX + b) = aE(X) + b
Var(aX + b) = a²Var(X)


where a and b are constants.


Note that a and/or b can be negative and the algebra follows the same
way. The square in the expression for variance ensures that the variance
will always remain positive (if it is not zero).
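
These two results can be checked on a small distribution. The sketch below uses the distribution from the Before you start section (values 3, 4, 5 with probabilities 0.1, 0.6, 0.3) and exact fractions; the function names are assumptions made for this example, and `transform` assumes a is not zero so that the transformed values stay distinct.

```python
from fractions import Fraction as F

# Distribution from the "Before you start" example: E(X) = 4.2, Var(X) = 0.36.
dist = {3: F(1, 10), 4: F(6, 10), 5: F(3, 10)}

def mean(d):
    """E(X) = sum of x * p."""
    return sum(x * p for x, p in d.items())

def variance(d):
    """Var(X) = E[(X - mean)^2]."""
    m = mean(d)
    return sum((x - m) ** 2 * p for x, p in d.items())

def transform(d, a, b):
    """Distribution of aX + b (assumes a != 0 so values stay distinct)."""
    return {a * x + b: p for x, p in d.items()}

for a, b in [(2, 5), (-2, 5)]:
    y = transform(dist, a, b)
    # Compare the direct calculation with a*E(X) + b and a^2 * Var(X).
    print(a, b, mean(y) == a * mean(dist) + b, variance(y) == a * a * variance(dist))
```

Using a negative multiplier changes the sign of the mean term but leaves the variance unchanged, since only a² appears in the variance result.
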


Example 1

Given that E(X) = 7 and Var(X) = 3, find

a) i) E(2X + 5) ii) Var(2X + 5)
b) i) E(5 − 2X) ii) Var(5 − 2X).


a) i) E(2X + 5) = 2E(X) + 5 = 14 + 5 = 19
ii) Var(2X + 5) = 2² Var(X) = 4 × 3 = 12
b) i) E(5 − 2X) = 5 − 2E(X) = 5 − 14 = −9
ii) Var(5 − 2X) = (−2)² Var(X) = 4 × 3 = 12


Example 2 

It costs $35 to hire a car for the day, and there is a mileage charge of 20 cents per mile. 

The distance travelled in a day has expectation 150 miles and standard deviation of 18 miles.
Find the expectation and standard deviation of the cost per day.


Let X = number of miles travelled, and
Y = cost of hire in $, then Y = 0.2X + 35 (expressing the cost as a linear function of the mileage).

E(X) = 150; Var(X) = 18²

E(Y) = 0.2 × 150 + 35 = 65; Var(Y) = 0.2² × 18² = (3.6)²
So the expectation of the cost per day is $65, and the standard deviation is $3.60.



Exercise 3.1 


1. Given that E(X) = 5.7, Var(X) = 1.9 for each of the following functions, 
write down the mean and variance. 


a) 2X + 7 b) 4 − 3X c) X + 3 d) 7X


2. a)
x        | 1   | 2   | 3   | 4   | 5
P(X = x) | 0.1 | 0.2 | 0.3 | 0.3 | 0.1

i) Calculate E(X) and Var(X).
ii) Calculate the mean and variance of 5X − 2.

b)
y        | −2  | −1  | 0   | 1   | 2
P(Y = y) | 0.1 | 0.2 | 0.3 | 0.3 | 0.1

i) Calculate E(Y) and Var(Y).
ii) Calculate the mean and variance of 5Y − 2.
iii) Y = X − 3: show that E(Y) = E(X) − 3 and Var(Y) = Var(X).

c)
v        | 7   | 8   | 9   | 10   | 11
P(V = v) | 1/2 | 1/4 | 1/8 | 1/16 | 1/16

i) Calculate E(V) and Var(V).
ii) Calculate the mean and variance of 4 + 3V.

d)
w        | −2  | −1  | 0   | 1    | 2
P(W = w) | 1/2 | 1/4 | 1/8 | 1/16 | 1/16

i) Calculate E(W) and Var(W).
ii) Calculate the mean and variance of 4 + 3W.
iii) W = V − 9: show that E(W) = E(V) − 9 and Var(W) = Var(V).

3.
x        | −2  | −1  | 0   | 1   | 2
P(X = x) | 0.1 | 0.4 | 0.2 | 0.2 | 0.1

a) Calculate E(X) and Var(X).

b) Calculate the mean and variance of 7 − 2X.


4. A supermarket sells top up vouchers valued at $5, $10, $15 or $20. 
The value, in dollars, of a top up voucher sold may be regarded as 
a random variable X with the following probability distribution: 
x 5 10 15 20 
P(X=x) 0.20 | 0.40 | 0.15 | 0.25 




a) Find the mean and variance of X. 

b) What is the probability that the next top up voucher sold is less than $15? 

c) As a promotion the marketing department decides the supermarket will offer 10% off the
cost of each top up voucher. Write down the mean and variance of the value of vouchers sold
in the promotion, assuming customers continue to use the same pattern.

The marketing department had originally planned to offer $1 off each voucher.

d) i) What would have been the mean and variance of the value of vouchers sold, assuming
customers continued to use the same pattern?
ii) An experienced manager had told the supermarket that they would only sell $5 top up
vouchers if they used this strategy. Explain why he thought this would happen.


5. A discrete random variable X has the probability function shown in the table below. 


x 6 7 8 9 
P(X =x) 0.1 0.2 0.3 0.4 


Find 
a) P(X = 8) b) E(X) c) Var(X)
d) E(5-2X) e) Var(5 - 2X). 


6. The random variable X has the following probability distribution. 


x 7 8 9 10 11
P(X=x) | 0.2 a 03 | O41 b 
a) Given that E(X) = 9.05, write down two equations involving a and b. 
Find 
b) the value of a and the value of b 
c) Var(X) 
d) Var(5 - 3X). 
7. The random variable X has probability function 


P(X =x) = G28) x=1,2,3,4,5. 
a) Construct a table giving the probability distribution of X. 


Find 
b) P(2<X<5) c) E(X) 
d) Var(X) e) Var(2 - 3X). 




8. The random variable X has the following probability distribution. 


x 10 12 15 16 18 20 24 25 


P(X = x) 0.1 0.2 0.05 | 0.05 0.2 0.1 0.2 0.1


Find 
a) E(X) 
b) Var(X) 
c) Var(20 - 3X). 
9. An independent financial advisor recommends investments in 7 traded 


options. The number X from which the client makes a profit can be 
modelled by the discrete random variable with probability function 


P(X = x) = kx x= 0, 1, 2, 3, 4, 5, 6, 7, where k is a constant. 

a) Find the value of k. 

b) Find E(X) and Var(X). 

The total cost of the investment is $4000 and the return on each successful
option is $1500. 

c) Find the probability that the client makes a loss overall. 


d) Find the mean and variance of the profit the client makes. 
10. A test is taken by a large number of members of the public. There are 


5 questions, and the probability distribution of X, the number of correct 
answers given, is: 


x 0 1 2 3 4 5 


P(X=x) | 0.05 | 0.10 | 0.25 | 0.30 | 0.20 | 0.10 


a) Find the mean and standard deviation of X. 
A mark, Y, is given where Y = 10X + 5. 


b) Calculate the mean and standard deviation of Y. 




3.2 Linear combination of two (or more) independent random 
variables 


If X, Y are independent random variables then

E(aX + bY) = aE(X) + bE(Y)

Var(aX + bY) = a²Var(X) + b²Var(Y)

A formal proof of this is beyond the scope of this course (you would need
to learn the theory of joint probability distributions to do it), but we can
see examples of it in particular cases.


In fact, if we can show


E(X + Y) = E(X) + E(Y)
Var(X + Y) = Var(X) + Var(Y)


then the general result will follow immediately using the results in the
last section.
Consider the score on a fair die. It has probability distribution given by


x        | 1   | 2   | 3   | 4   | 5   | 6
P(X = x) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6


Then E(X) = 3.5 and Var(X) = 35/12. In S1 Chapter 5 you met the distribution
of the sum of the scores on two fair dice.


If S = sum of the scores on the two dice, the sample space for S will be:


    | 1 | 2 | 3 | 4  | 5  | 6
  1 | 2 | 3 | 4 | 5  | 6  | 7
  2 | 3 | 4 | 5 | 6  | 7  | 8
  3 | 4 | 5 | 6 | 7  | 8  | 9
  4 | 5 | 6 | 7 | 8  | 9  | 10
  5 | 6 | 7 | 8 | 9  | 10 | 11
  6 | 7 | 8 | 9 | 10 | 11 | 12


Then the probability distribution for S is


s        | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
P(S = s) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36


and the way the sample space has been coloured shows the pattern leading 
to the symmetrical triangular distribution. 

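Then E(S) = 7; Var(S) = 35/6.

So E(S) = E(X) + E(Y); Var(S) = Var(X) + Var(Y).

If you now let D = difference between the two dice, i.e. D = X − Y where
X and Y are the scores on two fair dice, you can see what happens. In the table
below X is the row value and Y the column value.


    | 1 | 2  | 3  | 4  | 5  | 6
  1 | 0 | −1 | −2 | −3 | −4 | −5
  2 | 1 | 0  | −1 | −2 | −3 | −4
  3 | 2 | 1  | 0  | −1 | −2 | −3
  4 | 3 | 2  | 1  | 0  | −1 | −2
  5 | 4 | 3  | 2  | 1  | 0  | −1
  6 | 5 | 4  | 3  | 2  | 1  | 0


d        | −5   | −4   | −3   | −2   | −1   | 0    | 1    | 2    | 3    | 4    | 5
P(D = d) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36


E(D) = 0 = E(X) − E(Y); Var(D) = Var(S) = Var(X) + Var(Y)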




If we draw a graph showing the probability distributions you can see why 
the variance of the sum = the variance of the difference. 


[Figure: two bar charts of probability, one for the sum of scores on two fair dice (2 to 12) and one for the difference of scores on two fair dice (−5 to 5).]
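
The dice results can be confirmed by listing all 36 equally likely outcomes. The sketch below is an illustration using exact fractions; `mean_var` is a helper name assumed for this example and treats every value in the list as equally likely.

```python
from fractions import Fraction
from itertools import product

def mean_var(values):
    """Mean and variance of a list of equally likely values."""
    n = len(values)
    m = Fraction(sum(values), n)
    v = Fraction(sum(x * x for x in values), n) - m * m
    return m, v

# All 36 equally likely (X, Y) pairs for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

single = mean_var(list(range(1, 7)))           # one die
total = mean_var([x + y for x, y in outcomes])  # S = X + Y
diff = mean_var([x - y for x, y in outcomes])   # D = X - Y

print("one die:   ", single)
print("sum:       ", total)
print("difference:", diff)
```

The sum and the difference have different means but the same variance, which is twice the variance of a single die.
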
Example 3 


If X is a random variable with E(X) = 25, Var(X) = 8, and Y is a random
variable with E(Y) = 15, Var(Y) = 6, and X and Y are independent, find: 


i) E(X+ Y) and Var(X + Y) 
ii) E(X - Y) and Var(X - Y). 


i) E(X+ Y)=E(X)+E(Y)=25+ 15 =40 
Var(X + Y) = Var(X) + Var(Y) =8+6=14 

ii) E(X- Y) =E(X) -E(Y) =25-15=10 
Var(X - Y) = Var(X) + Var(Y) =8+6=14 


Example 4 


A fair coin is tossed and a fair die is thrown. Y is the score showing on 

the die, and X = 0 if the coin lands heads and X = 1 if the coin lands tails. 

i) Find the expectation and variance of each of X and Y. 

ii) Find the probability distribution of X + Y. 

iii) Use the probability distribution in ii) to find the expectation and 
variance of X + Y. 

iv) Show that E(X + Y) = E(X) + E(Y) and Var(X + Y) = Var(X) + Var(Y). 


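i) X and Y are both equally likely distributions, given explicitly below:

x | 0 | 1
P(X = x) | 1/2 | 1/2

y | 1 | 2 | 3 | 4 | 5 | 6
P(Y = y) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

so $E(X) = \frac{1}{2}(0 + 1) = \frac{1}{2}$, $E(X^2) = \frac{1}{2}(0 + 1) = \frac{1}{2}$ $\Rightarrow \mathrm{Var}(X) = \frac{1}{2} - \left(\frac{1}{2}\right)^2 = \frac{1}{4}$

and $E(Y) = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{7}{2}$; $E(Y^2) = \frac{1}{6}(1 + 4 + 9 + 16 + 25 + 36) = \frac{91}{6}$

$\Rightarrow \mathrm{Var}(Y) = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{35}{12}$

ii) Let Z = X + Y:

z | 1 | 2 | 3 | 4 | 5 | 6 | 7
P(Z = z) | 1/12 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/12

iii) $E(Z) = \frac{1}{6}(2 + 3 + 4 + 5 + 6) + \frac{1}{12}(1 + 7) = 4$

$E(Z^2) = \frac{1}{6}(4 + 9 + 16 + 25 + 36) + \frac{1}{12}(1 + 49) = \frac{115}{6}$

$\Rightarrow \mathrm{Var}(Z) = \frac{115}{6} - 4^2 = \frac{19}{6}$

iv) $E(X) + E(Y) = \frac{1}{2} + \frac{7}{2} = 4 = E(X + Y)$
$\mathrm{Var}(X) + \mathrm{Var}(Y) = \frac{1}{4} + \frac{35}{12} = \frac{19}{6} = \mathrm{Var}(X + Y)$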
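The distribution of X + Y in Example 4 can be built mechanically by multiplying probabilities over every pair of values, which is all that independence requires. The sketch below assumes distributions are stored as Python dictionaries; `add_independent` is a name invented for this illustration.

```python
from fractions import Fraction


def add_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent discrete variables."""
    out = {}
    for x, px in dist_x.items():
        for y, py in dist_y.items():
            # Independence: P(X = x and Y = y) = P(X = x) * P(Y = y)
            out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out


def mean(dist):
    return sum(v * p for v, p in dist.items())


def variance(dist):
    # Var(V) = E(V^2) - [E(V)]^2
    return sum(v * v * p for v, p in dist.items()) - mean(dist) ** 2


coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}       # X in Example 4
die = {k: Fraction(1, 6) for k in range(1, 7)}      # Y in Example 4
total = add_independent(coin, die)                  # Z = X + Y

print(sorted(total.items()))
print(mean(total), variance(total))
print(mean(coin) + mean(die), variance(coin) + variance(die))
```

The last two lines print the same pair of values, which is the check asked for in part iv).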
Exercise 3.2 
1. If X is a random variable with E(X) = 6, Var(X) = 5, and Y is a random
variable with E(Y) = 10, Var(Y) = 2, and X and Y are independent, find:
i) E(X + Y) and Var(X + Y)
ii) E(X - Y) and Var(X - Y).

2. If X is a random variable with E(X) = 15, Var(X) = 4, and Y is a random
variable with E(Y) = 15, Var(Y) = 2, and X and Y are independent, find:
i) E(X + Y) and Var(X + Y)
ii) E(X - Y) and Var(X - Y).

3. If X is a random variable with E(X) = 7, Var(X) = 1.5, and Y is a random
variable with E(Y) = -4, Var(Y) = 2, and X and Y are independent, find:
i) E(X + Y) and Var(X + Y)
ii) E(X - Y) and Var(X - Y).




4. An examination consists of a written paper and an oral test. The 
written paper marks (A) have mean 68.6 and standard deviation 14.7. 
The oral test marks (B) are independent of the written paper marks
and have mean 75.3 and standard deviation 6.2. The overall mark 
for the examination is found by adding 75% of A to 25% of B. 

Find the mean and standard deviation of the overall mark for the 
examination. 


5. Camford has two branches of a bank. The number of customers on a 
Tuesday in the Middle Street branch has mean 45 and standard 
deviation 9. The number of customers on a Tuesday in the King’s 
Street branch has mean 60 and standard deviation 12. 

i) Assuming the number of customers in the two branches are 
independent of one another, find the mean and standard deviation 
of the total number of customers using the two branches on a 
Tuesday. 


ii) Comment on whether the assumption made in part i) is reasonable. 
6. A fair coin is tossed and C = number of heads seen. A fair die is thrown
and D = score seen on the die. The random variable X is defined as
X = D - 2C.
i) Find the mean and variance of C.
ii) Given that E(D) = 3.5 and Var(D) = 35/12, find the mean and variance of X.


3.3 Expectation and variance of a sum of repeated 
independent observations of a random variable, 
and the mean of those observations 


Applying the results in the last two sections gives some results that are 
really important to the ideas of estimation and hypothesis testing you will 
meet in later chapters. 


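If $X_1, X_2, X_3, \ldots, X_n$ are independent observations of a random variable X
with E(X) = μ, Var(X) = σ² then:


$E\left(\sum_{i=1}^{n} X_i\right) = n\mu$, $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2$ and

$E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the mean of the n observations.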


The last of these four results is very important in statistics because it tells 
us that the sample mean is less variable than the individual observations, 
and that the variance decreases as the sample size increases. The sample 
mean becomes increasingly likely to be close to the population mean as the 
sample size increases. 
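To see how quickly the variance of the sample mean shrinks, the four results can be tabulated for a few sample sizes. This is a small Python sketch; the values E(X) = 24 and Var(X) = 8 are those used in Example 5, while the sample sizes and function names are arbitrary choices for illustration.

```python
from fractions import Fraction


def sum_moments(mu, var, n):
    """Mean and variance of X_1 + ... + X_n for n independent observations."""
    return n * mu, n * var


def sample_mean_moments(mu, var, n):
    """Mean and variance of the mean of n independent observations.

    Pass var as an integer or Fraction so that the division stays exact.
    """
    return mu, Fraction(var) / n


for n in (1, 5, 25, 100):
    print(n, sum_moments(24, 8, n), sample_mean_moments(24, 8, n))
```

The variance of the sum grows in proportion to n, while the variance of the mean falls in proportion to 1/n; the expectation of the mean does not change.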


Example 5 

X is a random variable with E(X) = 24, Var(X) = 8. Find the expectation and variance of

i) $\sum_{i=1}^{5} X_i$   ii) $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$


i) $Y = \sum_{i=1}^{5} X_i \Rightarrow E(Y) = 5 \times E(X) = 120$; $\mathrm{Var}(Y) = 5 \times \mathrm{Var}(X) = 40$


ii) $E(\bar{X}_n) = E(X) = 24$; $\mathrm{Var}(\bar{X}_n) = \frac{\mathrm{Var}(X)}{n} = \frac{8}{n}$


Example 6 
X is a random variable with E(X) = 17.2, Var(X) = 45. 


Find the smallest sample size for which the standard deviation of the 
sample mean will be not greater than 2. 


$\mathrm{Var}(\bar{X}_n) = \frac{45}{n} \le 2^2 \Rightarrow 4n \ge 45 \Rightarrow n \ge 11.25$, so the smallest sample size to
meet this criterion is 12.
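The same inequality can be solved for any variance and any limit on the standard deviation. The helper below is a hypothetical Python sketch (its name and interface are not from this book); it uses exact fractions so that a boundary case such as n = 5 exactly is not spoiled by rounding.

```python
from fractions import Fraction
from math import ceil


def smallest_sample_size(variance, max_sd):
    """Smallest n for which the standard deviation of the sample mean
    is not greater than max_sd, i.e. variance / n <= max_sd ** 2."""
    bound = Fraction(str(variance)) / Fraction(str(max_sd)) ** 2
    return max(1, ceil(bound))  # a sample needs at least one observation


print(smallest_sample_size(45, 2))  # the situation in Example 6
```

For the values in Example 6 the bound is 45/4 = 11.25, which is rounded up to 12.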


Exercise 3.3 


1. X is a random variable with E(X) = 15, Var(X) = 4. Find the expectation
and variance of 


i) yx i) X,=1>x, 


2. X is a random variable with E(X) = 0, Var(X) = 20. Find the expectation
and variance of


3. X is a random variable with E(X) = -5, Var(X) = 25. Find the expectation
and variance of 


50 = ¢a 
i) > x, ii) X= UX 
a : 


i= 


4. X is a random variable with E(X) = 35.2, Var(X) = 26.1. Find the expectation
and variance of 


i - 42 
i) YX, ii) X= TUX, 


5. X is a random variable with E(X) = -5.2, Var(X) = 21.


Find the smallest sample size for which the standard deviation of the 
sample mean will be not greater than 2. 


6. X is a random variable with E(X) = 522.1, Var(X) = 39.2. Find the smallest
sample size for which the standard deviation of the sample mean will be not 
greater than 1. 


3.4 Comparing the sum of repeated independent observations 
with the multiple of a single observation 


There is no easy way to derive the probability distribution of the sum of
a number of dice thrown together, but it is relatively easy to simulate the 
situation and compare what you see when a large number of repetitions 
are taken. The graphs below show 500 sums of 10 dice, and 500 single 
throws multiplied by 10: 


[Figure: two frequency charts with Score (10 to 58) on the horizontal axis and Frequency on the vertical axis: "Sum of 10 dice" (upper graph) and "10 x single die" (lower graph).]

The graphs have the axes kept to identical scales to enable the comparison 
to be made. 


Only six outcomes are possible in the second case, where a single
observation has been multiplied by 10, and those six outcomes are equally
likely, although the 500 observations are not exactly evenly spread.

When the sum of ten independent throws of a die is taken, any (integer)
score from 10 to 60 could be observed, but to get a total score of 10 you
would need to see a 1 on each of the ten dice, which would be extremely
unusual: not impossible, but occurring less than once in sixty million times
that ten dice are thrown (the probability is (1/6)^10, about 1 in 60.5 million).
In contrast, a single 1 occurs on average once in six throws, and in the
experimental results shown above there were 80 occasions in 500 trials for
which a score of 10 was obtained.

The outcomes for the sum have a lot more possible values, but those near
the middle of the possible range account for a very high proportion of all the
observations recorded (474 of the 500 observations were between 25 and
45 in this case), and the distribution is visibly much less spread out than
the lower graph showing the single throw multiplied by 10.
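A simulation of this kind is short to write. The Python sketch below is one possible version, not the one used to produce the graphs: it uses 20 000 repetitions rather than 500 and a fixed seed (both arbitrary choices) so that the sample variances settle close to the theoretical values 10 x 35/12 for the sum and 100 x 35/12 for the multiple.

```python
import random
from statistics import mean, pvariance


def simulate(trials=20000, dice=10, seed=1):
    """Simulate the sum of `dice` throws and `dice` times a single throw."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    sums = [sum(rng.randint(1, 6) for _ in range(dice)) for _ in range(trials)]
    multiples = [dice * rng.randint(1, 6) for _ in range(trials)]
    return sums, multiples


sums, multiples = simulate()

# Both have mean close to 35, but very different spreads.
print(mean(sums), pvariance(sums))            # theory: 35 and 350/12
print(mean(multiples), pvariance(multiples))  # theory: 35 and 3500/12
print(sorted(set(multiples)))                 # only six distinct values occur
```

The two means agree, but the variance of the multiple is about ten times the variance of the sum, which is the difference in spread visible in the two graphs.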


Summary exercise 3 


1. Given that E(X) = 12.1, Var(X) = 2.4,
for each of the following functions, write
down the mean and variance.
a) 2X + 3
b) 5 - 3X
c) X - 3
d) 9X

2.
x | 12 | 13 | 14 | 15 | 16
P(X = x) | 0.1 | 0.2 | 0.4 | 0.2 | 0.1

i) Calculate E(X) and Var(X).
ii) Calculate the mean and variance of 4X - 3.




3. The random variable X has the following 
probability distribution. 


x | 7 | 8 | 9 | 10 | 11
P(X = x) | 0.2 | a | 0.3 | 0.1 | b


a) Given that E(X) = 9.2, write down two 
equations involving a and b. 


Find 

b) the value of a and the value of b 
c) Var(X) 

d) Var(3 - 2X). 

4. The random variable X has probability
function
$P(X = x) = \frac{9 - 2x}{16}$, x = 1, 2, 3, 4.

a) Construct a table giving the probability
distribution of X.

Find

b) P(2 < X < 4)

c) E(X)

d) Var(X)

e) Var(2 + 3X).

5. If X is a random variable with E(X) = 10,
Var(X) = 3, and Y is a random variable
with E(Y) = 7, Var(Y) = 3, and X and Y are
independent, find
i) E(X + Y) and Var(X + Y)

ii) E(X - Y) and Var(X - Y).

6. An examination consists of a written paper
and an oral test. The written paper marks (A)
have mean 53.9 and standard deviation 8.9.
The oral test marks (B) are independent of
the written paper marks and have mean 64.5
and standard deviation 6.8. The overall mark
for the examination is found by adding 50%
of A to 50% of B.

Find the mean and standard deviation of the
overall mark for the examination.




7. The discrete random variable X has
probability function P(X = x) = kx² for
x = 1, 2, 3, 4, where k is a positive constant.

a) Show that k = 1/30.
b) Find E(X) and show that E(X²) = 11.8.
c) Find Var(X).
d) Find the mean and variance of Y = 17 - 4X.

8. A discrete random variable X has a
probability function as shown in the table
below, where a and b are constants.

x | 1 | 2 | 4 | 8
P(X = x) | 0.2 | 0.3 | a | b

a) Given that E(X) = 4.4, find the value of a
and the value of b.
b) Find P(1 < X < 6).
c) Find E(3X - 4).
d) Show that Var(X) = 9.24.
e) Evaluate Var(3X - 4).


EXAM-STYLE QUESTIONS

9. Exam marks, X, have mean 64.5 and
standard deviation 12.5. The marks need
to be scaled using the formula Y = aX + b
so that the scaled marks, Y, have mean 70
and standard deviation 10. Find the values
of a and b.

10. The independent variables X and Y are such
that X ~ B(8, 0.75) and Y ~ Po(7).

Find

a) E(3X + 4Y - 7)

b) Var(3X - 5Y + 3)

c) P(2X - Y = 15). Give your answer in
standard form correct to 2 significant
figures.


Chapter summary 

• For any random variable X and constants a, b:
E(aX + b) = aE(X) + b
Var(aX + b) = a²Var(X)

• If X, Y are independent random variables then
E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a²Var(X) + b²Var(Y)

• If $X_1, X_2, X_3, \ldots, X_n$ are independent observations of a random variable X with
E(X) = μ, Var(X) = σ² then:

$E\left(\sum_{i=1}^{n} X_i\right) = n\mu$, $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2$ and

$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n} \times n\mu = \mu$, $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \times n\sigma^2 = \frac{\sigma^2}{n}$

• Remember two special cases:
◦ If X, Y are independent random variables then
E(X - Y) = E(X) - E(Y)
Var(X - Y) = Var(X) + Var(Y)
◦ If $X_1, X_2, X_3, \ldots, X_n$ are independent observations of a random variable X then:

$E\left(\sum_{i=1}^{n} X_i\right) = n\mu = nE(X)$, but $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2 \ne \mathrm{Var}(nX) = n^2\sigma^2$



Maths in real-life 


The mathematics of the past 


In Pure 2/3 Chapter 2 you met the phenomenon of radioactive decay and
the behaviour that half of the remaining radioactive atoms in any substance
will decay in a period known as the half-life of the substance. The rate at
which atoms decay follows (roughly) a Poisson process with the average rate
in a fixed period reducing over time. Later in this book you will meet the
Central Limit Theorem, which is one of a number of instances you will meet in
statistics where the long term behaviour of random events is very stable
(often known as The Laws of Large Numbers), and the predictability of the
half-life characteristic is an example of a practical situation in which
observed reality matches the theoretical predictions.


In the middle of the last century, Willard Libby
developed a method of dating organic materials
discovered in archeological excavations. The method
relies on the naturally occurring carbon isotope ¹⁴C
(carbon 14) that is created by the action of cosmic
ray neutrons on nitrogen 14 atoms in the upper
atmosphere. The ¹⁴C atoms quickly oxidise to form
carbon dioxide in the air. All organic materials (plants,
animals, humans) assimilate ¹⁴C from the atmosphere
throughout their lifetimes. However, when they die,
they stop exchanging carbon with the atmosphere and
it then starts to decay at the known rate.

[Figure: cosmic ray neutrons are captured by ¹⁴N in the upper atmosphere, which loses a proton to become ¹⁴C; all three isotopes of carbon (common ¹²C, rare ¹³C and ¹⁴C) are taken up by living organisms; following death and burial, wood and bones lose ¹⁴C as it changes to ¹⁴N by beta decay.]

The fact that the proportion of ¹⁴C amongst all carbon
elements in the atmosphere is known means that
measuring the proportion existing now allows a dating
process. Libby’s work is generally regarded as having one
of the widest impacts of any scientific discovery in the
1900s and he was awarded the Nobel Prize for Chemistry
in 1960; his work revolutionised archaeology and has
applications in other human and physical sciences.


When the technique was introduced the calculations
were based on a Geiger counter detecting the number of decaying
particles, but other more accurate methods were soon developed and
now the technique is commonly based on using Accelerator Mass
Spectrometry, which determines the proportions of ¹⁴C atoms and the other
carbon isotopes. Calculations for dates using
different detection methods for ¹⁴C atoms use different
mathematics, but all require the use of a base line
of what the ¹⁴C presence has been historically.
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The Dead Sea Scrolls are a collection of texts which 
were discovered in caves about 2 kilometres from 

the northwest shore of the Dead Sea. The texts are of 
enormous historical and religious significance and 
carbon dating suggests that the majority of the scrolls 
belong to the last two centuries BC and the first 
century AD. 


Newspapers will tend to sensationalise research 
findings and attribute precision to them which the 
scientists do not claim - headlines might use language 
like ‘Carbon dating pinpoints Mayan calendar’ when 
reporting the analysis of an ancient beam from a 
Guatemalan temple which the scientists concluded 
came from a tree which was cut down and carved 
around AD 658-696. 


The process of carbon dating is heavily statistical:
dates are reported using interval estimates giving
an idea of both when the artifact was created and
some idea of the precision with which the date
is known (you will meet interval estimates
formally in Chapter 7 of this book).
Current developments on the technique
of carbon dating focus on improving the
calibration scales used for identifying
the best estimate for the date (the centre
of the range reported) and on how to
improve the reliability of the estimates,
which would allow narrower ranges
to be reported.

[Photograph: The Step Pyramid of Djoser in Saqqara was shown to be the oldest stone pyramid in Egypt by carbon dating techniques.]
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Linear combination of Poisson and 


normal variables 


Manufacturers need to closely control 
production levels — if there is less 
chocolate in a bar or box than the label 
says they can be prosecuted, but if there 
is consistently more than the label says 

it costs the company money. The bars 
shown are made from a single mould, 

so there is one distribution to control, 
but the situation in a box of chocolates is 
that it depends on the sum of a number 
of distributions, so manufacturers need 
to be able to control what happens to the 
sum by knowing how it depends on the 
individual distributions. 


Objectives 
After studying this chapter you should be able to use, in the course of solving problems, 


the results that: 


• If X has a normal distribution then so does aX + b.
• If X and Y have independent normal distributions then aX + bY has a normal distribution.
• If X and Y have independent Poisson distributions then X + Y has a Poisson distribution.


Before you start 


You should know how to:
1. Calculate probabilities using the Poisson
distribution, e.g.
X ~ Po(6). Find P(X = 2).
$P(X = 2) = \frac{e^{-6} \times 6^2}{2!} = 0.0446$

2. Calculate probabilities using the normal
distribution, e.g.
X ~ N(10, 25). Find P(X < 0).
$P(X < 0) = P\left(Z < \frac{0 - 10}{\sqrt{25}} = -2\right) = 0.0228$

Skills check:
1. X ~ Po(2.1). Find P(X = 1).
2. X ~ N(20, 3.4). Find P(X > 21.1).


4.1 The distribution of the sum of two independent 
Poisson random variables 


In Chapter 3 you saw that the expectation and variance of the sum of any
two independent random variables are given by the sum of the expectations and
the sum of the variances of the two random variables, but this tells you nothing
about the probability distribution of the sum. You also saw the detail of the
distribution of the sum of two uniform distributions (the scores on two dice)
which gave a triangular distribution, so you know that it is not generally the
case that the distribution of the sum will be in a similar form to the original.


However, there are two special cases where the sum does have the same form.
The first of these is the Poisson distribution:


The sum of two independent Poisson random variables is still a Poisson:


If X and Y are two independent Poisson random variables, and X ~ Po(λ),
Y ~ Po(μ) then X + Y ~ Po(λ + μ)


The full proof of this requires some sophisticated algebraic manipulation,
in particular using the binomial expansion, and it is noted at the end of this section.


There are a couple of things you can consider which may help with why
this distribution might have the special regenerative property.


• Remember that for a Poisson(λ) random variable the mean and variance
are the same, E(X) = λ, Var(X) = σ² = λ. By the general results for the
expectation and variance of a sum, X + Y has mean λ + μ and variance
λ + μ, so whatever its detailed distribution is, its mean and variance
are equal, as they must be for any Poisson distribution.


• Remember that changing the length of interval being considered for a
Poisson distribution simply resulted in a proportional change to the
parameter to be used, so if the number of telephone calls arriving at
a switchboard in a 10-minute period follows a Poisson distribution and,
on average, there are 7 calls in that time, then the number of calls in a
5-minute period would follow a Po(3.5) distribution, and the number
of calls in a 20-minute period would follow a Po(14) distribution. You
could think of the 20-minute period as the total in two separate
10-minute intervals.


The key steps in the full proof are much easier to see in a concrete example. 
Consider the probability that X + Y takes the value 3 when X ~ Po(2.3) and 
Y ~ Po(3.8) and X, Y are independent. For the sum to be 3, the possible 
combinations for the values of X and Y are 0, 3; 1, 2; 2, 1 and 3, 0 and you 
can calculate that probability: 
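One way to carry out this calculation numerically is sketched below in Python. The helper `poisson_pmf` is written out here for illustration; it adds the probabilities of the four cases and compares the total with P(X + Y = 3) taken directly from a Po(2.3 + 3.8) = Po(6.1) distribution.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


# The four cases (X, Y) = (0, 3), (1, 2), (2, 1), (3, 0); independence lets
# the two probabilities in each case be multiplied together.
by_cases = sum(poisson_pmf(x, 2.3) * poisson_pmf(3 - x, 3.8) for x in range(4))

# The same probability from the Poisson distribution with parameter 2.3 + 3.8.
direct = poisson_pmf(3, 2.3 + 3.8)

print(by_cases, direct)
```

The two values agree (both are about 0.0848). Behind the agreement is the binomial expansion: the sum over the four cases contains (2.3 + 3.8)³ written out term by term.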
Proof that if X ~ Po(λ) and Y ~ Po(μ) and X, Y are independent, then X + Y ~ Po(λ + μ):
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Example 1 
Two radioactive substances emit, on average, 3 electrons per second and 
2 electrons per second respectively. 

Calculate the probability that a total of exactly 4 electrons are emitted 
in a given second. 
If X is the total number of electrons emitted in a second, then X ~ Po(5).
And $P(X = 4) = \frac{e^{-5} \times 5^4}{4!} = 0.175$ (3 s.f.)


Example 2 

The number of goals a team scores in a league match may be modelled 

by a Poisson distribution with mean 2.2. The number of goals the same 

team concedes in a league match may be modelled by a Poisson 

distribution with mean 1.5. 

a) Assuming these are independent of one another, find the probability that: 
i) the match finishes as a 1-1 draw 
ii) there are two goals in the match. 

b) Comment on the assumption that the number of goals scored and 
conceded by a team are independent of one another. 


a) i) You need both random variables to take the value 1, so 


$P(\text{score is 1-1}) = \frac{e^{-2.2} \times 2.2^1}{1!} \times \frac{e^{-1.5} \times 1.5^1}{1!} = 0.24377 \times 0.334695 = 0.0816$ (3 s.f.)
ii) You could work out the other two possible scores (2-0 and 0-2) and add them
to the answer in part i), or use the fact that the total number of goals has a Po(3.7) distribution.
$P(\text{two goals}) = \frac{e^{-3.7} \times 3.7^2}{2!} = 0.169$ (3 s.f.)
b) This is unlikely to be true because it is a competitive situation where the difference
between 3-1 and 2-1 is not the same as the score going to 2-2 (a draw) from 2-1.




Exercise 4.1 

1. The demand for two magazines in a newsagents forms (independent) Poisson 
distributions. On average the monthly demand for Internet Investigator is 4.3, and 
for the Web Wanderer it is 2.9. Calculate the probability that: 
a) the newsagent is asked for 3 Internet Investigators and 3 Web Wanderers in 

one month 

b) he is asked for 5 magazines altogether in one month 
c) he is asked for 5 magazines in two months of one year. 


2. a) Serious accidents in a certain type of manufacturing industry can be adequately 
modelled by the Poisson distribution with a mean rate 
of 1.6 per week. 
i) What is the probability that there are no serious accidents in a particular week? 
ii) What is the probability that there are at least three serious accidents in a 
three-week period? 
iii) What is the probability that in a four-week period there is exactly one 
week in which there are serious accidents? 

b) Minor accidents in this manufacturing industry can also be adequately modelled 
by the Poisson distribution with a mean rate of 5.4 per week, and these can be 
assumed to occur independently of the serious accidents. 

i) State the distribution for the total number of accidents in the industry. 
ii) What is the probability that there are at least five accidents in a particular week? 


3. Two radioactive substances emit, on average, 3.2 electrons per second and 1.3 electrons 
per second respectively. 

a) Calculate the probability that a total of exactly 3 electrons are emitted in a given 
second. 

b) Calculate the probability that a total of at least 7 electrons are emitted in a 
2-second period. 

4. You are given that X ~ Po(5) and Y ~ Po(3) are independent random variables. Which 
of the following have a Poisson distribution? State the parameter value for any which 
are Poisson variables. 

i) 3X+2 ii) 2X+3Y iii) X+Y iv) X-Y 

5. The number of vehicles passing a point on a motorway heading east has a Poisson 
distribution with mean 24 per minute. The number of vehicles passing the same point 
on the motorway heading west has a Poisson distribution with mean 21 per minute. 
Find the probability that the total number of vehicles passing the point in a 10-second 
period is less than 5. 

6. The number of customers entering a jewellery shop in the first hour that it opens can 
be modelled by a Poisson distribution with mean 2.4, and the number of customers 
entering for the rest of the day can be modelled by a Poisson with mean 3.1 per hour. 
The shop opens at 9.00 am. Find the probability that fewer than 4 customers enter the 
shop between 9.30 am and 12.00 pm. 
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4.2 Linear functions and combinations of normal random 
variables 


In Chapter 3 you saw: 


e the relationships between the expectation and variance of a linear 
function of X and the expectation and variance of X; 

e the expectation and variance of the sum of any two independent random 
variables are given by the sum of the expectations and the sum of the 
variances of the two random variables; 

e the expectation and variance of the difference of any two independent 
random variables are given by the difference of the expectations and 
the sum of the variances of the two random variables. 

The normal distribution is the second instance where the distribution of the
sum is in the same form as the original distributions, but in this case it is also
true for the difference (which is not the case for the Poisson), and for a linear function
of a single normal random variable (which again is not the case for the Poisson).

Using the results from Chapter 3 we can deduce what the parameters will be: 


If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ are independent then:


• $aX + b \sim N(a\mu_X + b,\ a^2\sigma_X^2)$


• $aX + bY \sim N(a\mu_X + b\mu_Y,\ a^2\sigma_X^2 + b^2\sigma_Y^2)$


The full proof of this requires the use of much more sophisticated statistical
methods (moment generating functions), and unlike the Poisson there are not 
any simple examples you can do to satisfy yourself that it works in particular cases. 


Example 3 

A container is known to have mass 40 grams. The amount of liquid a 
machine dispenses into the container follows a normal distribution 
with mean 200 ml and standard deviation 10ml. 

The liquid has density 0.85kg per litre. 

Find the probability that the filled container weighs less than 200 grams. 


If X = volume of liquid in ml and Y = mass of filled container in 
grams then Y = 40 + 0.85X. 


Given that X ~ N(200, 10²) this means
Y ~ N(40 + 0.85 × 200, 0.85² × 10²) ⇒ Y ~ N(210, 8.5²)
$P(Y < 200) = P\left(Z < \frac{200 - 210}{8.5} = -1.176\right) = \Phi(-1.176)$
= 1 - Φ(1.176) = 1 - 0.8802 = 0.120
So there is a 12% chance that the filled container weighs less than 200 grams.




In Example 3 the container was known to have a mass of 40 grams. More 
realistically on a production line like this, the containers would also have 
a distribution which can be modelled by a normal distribution. The next 
example asks the same question as in Example 3, but using a distribution 
for the mass of the container. 


Example 4 

A container has mass which is normally distributed with mean 40 grams 
and standard deviation 3 grams. The amount of liquid a machine dispenses 
into the container follows a normal distribution with mean 200 ml and 
standard deviation 10 ml. 

The liquid has density 0.85 kg per litre. Find the probability that the filled 
container weighs less than 200 grams. 


If X = volume of liquid in ml, Y = mass of empty container 


in grams and W = mass of the filled container in grams then 
W = 0.85X + Y. 


Given that X ~ N(200, 10²) and Y ~ N(40, 3²) this means
W ~ N(0.85 × 200 + 40, 0.85² × 10² + 3²) ⇒ W ~ N(210, 81.25)


$P(W < 200) = P\left(Z < \frac{200 - 210}{\sqrt{81.25}} = -1.109\right) = \Phi(-1.109)$


= 1 - Φ(1.109) = 1 - 0.866 = 0.134


So there is now a 13.4% chance that the filled container weighs less than 200 grams.
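Examples 3 and 4 follow the same pattern: find the mean and variance of the linear combination, then use the normal distribution with those parameters. The Python sketch below uses `NormalDist` from the standard library's `statistics` module; the function `linear_combination` and its argument layout are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination(terms, constant=0.0):
    """Distribution of constant + sum of a_i * X_i for independent normal X_i.

    `terms` is a list of (coefficient, mean, variance) triples.
    """
    mean = constant + sum(a * mu for a, mu, _ in terms)
    variance = sum(a ** 2 * var for a, _, var in terms)
    return NormalDist(mean, sqrt(variance))  # NormalDist takes the standard deviation


# Example 3: Y = 40 + 0.85X with X ~ N(200, 10^2)
fixed_container = linear_combination([(0.85, 200, 10 ** 2)], constant=40)

# Example 4: W = 0.85X + Y with X ~ N(200, 10^2) and Y ~ N(40, 3^2)
variable_container = linear_combination([(0.85, 200, 10 ** 2), (1, 40, 3 ** 2)])

print(fixed_container.cdf(200))     # about 0.120
print(variable_container.cdf(200))  # about 0.134
```

Both distributions have mean 210; only the variance changes (72.25 against 81.25), which is why the second probability is slightly larger.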


There are two important special cases to take particular note of:


• The mean of n independent observations of a normal random variable is still a
normal distribution: if X ~ N(μ, σ²) then $\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$


• The difference of two independent normal random variables
is still a normal distribution, i.e. if X and Y are two independent
normal random variables, and $X \sim N(\mu_X, \sigma_X^2)$, $Y \sim N(\mu_Y, \sigma_Y^2)$,
then $X - Y \sim N(\mu_X - \mu_Y,\ \sigma_X^2 + \sigma_Y^2)$


The first of these results will be considered in much more depth later — in 
Chapter 6 on sampling and Chapter 7 on estimation, and it is the foundation 
for testing the mean of a normal distribution which you will meet in 
Chapter 9. 
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The second result is important when you consider situations in which the 
relative size of two measurements is more important than their actual sizes:


• In an engine, a piston moves up and down inside a cylinder. Obviously
the piston diameter has to be smaller than the cylinder for it to fit, but it
is also important that it is not too much smaller: if it is then the piston
will wobble around and cause wear on the sides of the cylinder, and also
not be as efficient as some of the energy can escape past the sides of the
piston rather than driving the engine.

[Figure: a piston inside a cylinder bore.]


• A similar situation applies to a nut and bolt, and again there will be a
tolerance for how much smaller the bolt diameter can be than the nut
diameter, otherwise the threads will not catch and the bolt can slip
through the nut.


• Which of two athletes wins in a competition? The times or distances recorded
by an athlete in an event can often be (approximately) modelled by a normal
distribution. The sign of the difference in their times or distances determines
who wins. However, the result depends on the independence of the two
distributions, which is unlikely to be exactly true in a head to head competition
if the athletes can see where they are relative to the other athlete during the
competition. It may still be a useful first approximation: in statistics you
are often faced with situations in which the only thing you can do is something
you know is not a perfect description of the situation. As long as you treat the
result with caution, knowing its imperfections, it is likely to be better than having
no information to work with.


Example 5 

A batch of bolts has diameters which are normally distributed with mean 
17.5mm and standard deviation 0.4mm. The diameters of holes of the 
batch of nuts delivered with the bolts are normally distributed with mean 
18mm and standard deviation 0.3 mm. 


i) Find the probability that a randomly chosen bolt will fit inside a 
randomly chosen nut. 


If the diameter of the bolt is more than 1.1 mm smaller than the diameter 
of the hole then the bolt is not securely held. 


ii) Find the probability that a randomly chosen bolt which fits inside a 
randomly chosen nut will be held securely. 


If X = diameter of the bolt and Y = diameter of the nut then X ~ N(17.5, 0.4²) and
Y ~ N(18, 0.3²).

i) Y - X ~ N(0.5, 0.3² + 0.4²) = N(0.5, 0.5²)

For the bolt to fit inside the nut, you need Y - X > 0
$P(Y - X > 0) = P\left(Z > \frac{0 - 0.5}{0.5} = -1\right) = 1 - \Phi(-1) = \Phi(1) = 0.8413$.

ii) This is a conditional probability: that it holds securely given that it fitted inside.
If it fits inside and the bolt is held securely then 0 < Y - X < 1.1
$P(0 < Y - X < 1.1) = P\left(-1 < Z < \frac{1.1 - 0.5}{0.5} = 1.2\right)$

= Φ(1.2) - Φ(-1) = Φ(1.2) - (1 - Φ(1))
= 0.8849 - (1 - 0.8413) = 0.7262


$P(0 < Y - X < 1.1 \mid Y - X > 0) = \frac{0.7262}{0.8413} = 0.863$.
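The probabilities in Example 5 can be checked without tables. This Python sketch uses `NormalDist` from the standard library; the variable names are illustrative, and `gap` stands for Y - X, the amount by which the hole is wider than the bolt.

```python
from math import sqrt
from statistics import NormalDist

# Y - X ~ N(18 - 17.5, 0.3^2 + 0.4^2): means subtract, variances add.
gap = NormalDist(18 - 17.5, sqrt(0.3 ** 2 + 0.4 ** 2))

p_fits = 1 - gap.cdf(0)                          # P(Y - X > 0)
p_fits_and_secure = gap.cdf(1.1) - gap.cdf(0)    # P(0 < Y - X < 1.1)
p_secure_given_fits = p_fits_and_secure / p_fits # conditional probability

print(p_fits, p_fits_and_secure, p_secure_given_fits)
```

The three values are about 0.841, 0.726 and 0.863, in line with the worked solution.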
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Exercise 4.2 


1. You are given that X ~ N(5, 7) and Y ~ N(3, 10) are independent random 
variables. Which of the following have a normal distribution? State the 
parameters for any which are normally distributed. 


i) 3X+2 ii) 2X+3Y iii) X+Y iv) X-Y 
2. X ~N(320, 20) and Y ~ N(110, 30) are independent random variables. 

i) State the distribution of 2X + 3Y. 

ii) Find the probability that 2X + 3Y is more than 1000. 


3. X ~ N(40, 4.5) and Y ~ N(62.5, 12.1) are independent random variables. 
i) State the distribution of 5X + 4Y. 
ii) Find the probability that 5X + 4Y is more than 440. 


4. A~N(12,7) and B ~ N(-3, 2) are independent random variables. 
i) State the distribution of 4A - 3B. 
ii) Find the probability that 4A - 3B is more than 55. 

5. The cost of Jarinda’s electricity for a month is a fixed charge of 800 cents 
together with a charge of 5.8 cents per unit of electricity used. The number 


of units she uses in a month is normally distributed with mean 456 units 
and standard deviation 18.2 units. 


i) Find the mean and standard deviation of the total cost of Jarinda’s 
electricity in a randomly chosen month. 


ii) Find the probability that her bill in a randomly chosen month is 
more than 35 dollars. 
6. W is the mass, in grams, of water in a bottle. B is the mass, in grams, of the bottle.
W ~ N(752, 8) and B ~ N(30, 1.2) are independent.
i) Find the probability that a filled bottle has a total mass of less than 775 grams.
ii) A litre of water has a mass of 1 kilogram. Find the probability that 4 randomly
chosen bottles contain less than 3 litres of water.


7. The bottles of water described in question 6 are packed in cardboard boxes 
which contain 24 bottles. The mass C, in grams, of the cardboard box is 
normally distributed with mean 750 grams and standard deviation 6.2 grams. 
a) i) Find the distribution of the mass of a box with 24 filled bottles. 

ii) Find the probability that the full box has a mass of more than 19.5 kg. 
b) A shop takes delivery of 20 full boxes. Find the probability that the 
average mass of the boxes is more than 19.5 kg. 
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Summary exercise 4 


1. a) Minor accidents in a particular factory can be adequately modelled by the
Poisson distribution with a mean rate of 3.6 per week.

i) What is the probability that there are no minor accidents in a particular
week?

ii) What is the probability that there are at least five minor accidents in a
three-week period?

iii) What is the probability that in a four-week period there are exactly
three weeks in which there are minor accidents?

b) Serious accidents in this factory can also be adequately modelled by a Poisson
distribution with a mean rate of 0.3 per week, and these can be assumed to occur
independently of the minor accidents.

i) State the distribution for the total number of accidents in the industry.

ii) What is the probability that there are at least five accidents in a particular
week?


2. The number of vehicles passing a point on a rural road heading north has a Poisson
distribution with mean 3.5 per hour. The number of vehicles passing the same point
on the road heading south has a Poisson distribution with mean 3.2 per hour.

i) Find the probability that no vehicles pass the point in a randomly chosen
15-minute period.

ii) Find the probability that the total number of vehicles passing the point in a
two-hour period is less than 6.


3. The cost of Grigor’s gas for a month is a fixed charge of 500 cents together with a
charge of 6.4 cents per unit of gas used. The number of units he uses in a month is
normally distributed with mean 965 units and standard deviation 83.2 units.

i) Find the mean and standard deviation of the total cost of Grigor’s gas in a
randomly chosen month.

ii) Find the probability that his bill in a randomly chosen month is more than
65 dollars.


4. W is the mass, in grams, of water in a bottle. B is the mass, in grams, of the bottle.
W ~ N(501, 5.2) and B ~ N(25, 1.5) are independent.
i) Find the probability that a filled bottle has a total mass of less than 525 grams.
ii) A litre of water has a mass of 1 kilogram. Find the probability that 4 randomly
chosen bottles contain more than 2 litres of water.


EXAM-STYLE QUESTIONS

5. The volume of water in bottles made by Aguafresh is normally distributed with
mean 751 millilitres and standard deviation 9.2 millilitres. The volume of water in
bottles made by Lifespring is normally distributed with mean 1002 millilitres and
standard deviation 12.8 millilitres. Find the probability that 4 randomly chosen
bottles of Aguafresh contain less liquid than 3 randomly chosen bottles of Lifespring.

6. In their hockey matches, the Steelers score goals independently and at random
times with an average of 2.8 goals per match.

i) State the expected number of goals that the Steelers will score in the first half
of a match.

ii) Find the probability that the Steelers will score twice in the first half of a
match and not in the second half.

iii) Given that the Steelers score two goals in a match, find the conditional
probability that both goals were scored in the first half.

iv) Hockey matches last for 70 minutes. In a particular match the Steelers score one
goal in the first 10 minutes. Find the probability that they will score at least
two goals in the match.

Independently of the number of goals scored by the Steelers, the number of goals
scored per hockey match by the Giants has a Poisson distribution with mean 1.7.

v) Find the probability that more than 3 goals will be scored in a particular match
when the Steelers play the Giants.

7. The number of characters on a page of a book can be modelled by a normal
distribution with mean 1983 and standard deviation 44.7. Find the probability that
the total number of characters in a random sample of 8 pages is more than 16 000.

8. The weights of men follow a normal distribution with mean 74 kg and standard
deviation 7.5 kg. The weights of women follow a normal distribution with mean
60.5 kg and standard deviation 5.4 kg.

Four randomly selected men sit on one side of a large see-saw and five randomly
selected women sit on the other side. What is the probability that the men’s side
is heavier than the women’s side?

9. The weights of men in a different country follow a normal distribution with mean
78 kg and standard deviation 7 kg. The weights of women follow a normal distribution
with mean 63 kg and standard deviation 6.1 kg.

Four randomly selected men sit on one side of a large see-saw and five randomly
selected women sit on the other side. What is the probability that the women’s side
is heavier than the men’s?

10. The masses, in grams, of a certain strain of tomatoes are normally distributed
with mean 64 and standard deviation 5.3. The tomatoes are packed in bags of six, and
the tomatoes are randomly chosen for each bag. The bags are checked by Quality
Control and any bag with a total mass of tomatoes of less than 350 grams is rejected.
Find the proportion of bags that are rejected.

11. An elevator will normally carry a warning sign telling people the safe capacity
limit in kg. For ease of use it will often also give a guideline for the number of
people who can safely use the elevator together.

A hotel elevator sign says the safe limit is 900 kg. It advises that not more than
10 adults should travel together.

Health data in that country suggests that the weights of adults are normally
distributed with a mean of 79.5 kg and standard deviation 12.1 kg.

Find the probability that a randomly selected group of 10 adults would exceed the
safe weight limit for this elevator.


12. A piston is in the shape of a cylinder. The diameters of the pistons are normally
distributed with a mean of 92.5 mm and a standard deviation of 0.4 mm. The pistons
have to fit inside cylindrical sleeves whose diameters are normally distributed with
a mean of 93.4 mm and a standard deviation of 0.3 mm.

a) Find the probability that a randomly selected piston will fit inside a randomly
selected sleeve.

In order to function properly, the diameter of the piston needs to be no more than
2 mm less than the diameter of the sleeve.

b) Given that a piston fits inside a sleeve, find the probability that it functions
properly.


Chapter summary


• The sum of two independent Poisson random variables is still Poisson:
If $X \sim \mathrm{Po}(\lambda)$ and $Y \sim \mathrm{Po}(\mu)$ are independent then $X + Y \sim \mathrm{Po}(\lambda + \mu)$.
• The sum of two independent normal random variables is still normal:
If $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$ are independent then:

$aX + b \sim N(a\mu_x + b,\; a^2\sigma_x^2)$
$aX + bY \sim N(a\mu_x + b\mu_y,\; a^2\sigma_x^2 + b^2\sigma_y^2)$
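
The rule for combining independent normal variables can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist`; the two distributions and the constants a = 2, b = 3 are made-up illustrative values, not taken from any question in this book. Note that `NormalDist` takes the standard deviation, whereas the N(μ, σ²) notation quotes the variance.

```python
from statistics import NormalDist

# Hypothetical independent variables: X ~ N(10, 2^2) and Y ~ N(5, 3^2).
# NormalDist is given the standard deviation, not the variance.
X = NormalDist(mu=10, sigma=2)
Y = NormalDist(mu=5, sigma=3)

# aX + bY with a = 2 and b = 3; NormalDist arithmetic assumes independence.
combo = 2 * X + 3 * Y

print(combo.mean)          # a*mu_x + b*mu_y
print(combo.variance)      # a^2*sigma_x^2 + b^2*sigma_y^2
print(1 - combo.cdf(40))   # P(2X + 3Y > 40)
```

The mean and variance printed should agree with the formulas above: 2 × 10 + 3 × 5 for the mean and 4 × 4 + 9 × 9 for the variance.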
5 Continuous random variables


Since any random variable that is a measurement of time, length, etc. will
be a continuous random variable, the material in this chapter is fundamental
to the realistic treatment of much of the world around us. The management of risk
is now very big business - modelling the behaviour of tides, earthquakes, floods, etc.
is important both for insurance companies and for governments in determining
policy such as whether to allow residential developments in areas of potential flood
risk.


Objectives 

After studying this chapter you should be able to: 

e Understand the concept of a continuous random variable, and recall and use properties of a 
probability density function (restricted to functions defined over a single interval). 


e Use a probability density function to solve problems involving probabilities, and to calculate 
the mean and variance of a distribution (explicit knowledge of the cumulative distribution 
function is not included, but location of the median, for example, in simple cases by direct 
consideration of an area may be required). 


Before you start 
You should know how to:

1. Integrate a variety of functions, and use the results, e.g. if the definite
integral of $k \sin x$ over a given interval is 1, find the value of k, using
$\int k \sin x \, dx = -k \cos x + c$.

Skills check:

1. If the definite integral of $(2 - kx)$ over a given interval is 1, find the value of k.
2. Calculate $\int (x^2 - 2x + 5)\,dx$.
5.1 Introduction to continuous random variables


The times taken by competitors in the swimming section of a triathlon are 
shown in the following histograms, with smaller and smaller intervals: 


[Figure: four graphs of swim times in the triathlon, each plotted against Time (minutes) from 5 to 25; the first three are histograms with progressively narrower intervals.]


As the number of intervals increases (in the first three graphs), the data show jagged 
edges. This is a representation of a finite number of observations - and if another set 
of observations was made, you would expect to see something broadly similar but 
the detail is unlikely to be exactly the same. The fourth graph looks like a plausible 
description of the distribution of times for the swim section of the triathlon in that 
competition. 


On a different day, if the weather is different, with different competitors, 
or on a different course the distribution might be different. 


These histograms are drawn with relative frequency density on the vertical
axis, so the total area of all the bars is always 1.


The limit of the cases where smaller and smaller intervals are taken, but the 
total area stays at 1, is what is meant by a continuous random variable. 


The probabilities are defined as the area under the curve between two 
values of x. 


Measurements of length, time, mass, area and volume as well as 
compound measures such as speed and density are on a continuous 
scale. While they are often recorded by being grouped in intervals, the 
actual distribution is continuous. 




Exercise 5.1 
1. State which of the following random variables are continuous. 


a) The length of the winning throw in the men’s javelin competition 
at the next Olympic Games. 


b) The number of blades of grass in a 1 square metre area of lawn. 
c) The volume of water in a bottle. 
d) The length of time it takes Habib to walk to school on a particular morning. 


e) The number of times Habib is late for school in a particular term. 


2. The histogram shows the durations of the eruptions of the Old Faithful
Geyser in Yellowstone Park over a period of time.
Sketch what you think the graph of the continuous random variable looks like.

[Histogram: Duration of eruption (minutes), horizontal axis from 0 to 6.]

3. The histogram shows the time between eruptions of the Old Faithful Geyser.
Sketch what you think the graph of the continuous random variable looks like.

[Histogram: Time between eruptions (minutes), horizontal axis from 0 to 100, vertical axis from 0.00 to 0.05.]

4. The histogram shows the weights in grams of a sample of African grey parrots. Sketch what you
think the graph of the continuous random variable looks like.

[Histogram: weights in grams of a sample of African grey parrots.]


5.2 Probability density functions 


To be a probability density function (pdf), f(x) must satisfy these
basic properties:

• $f(x) \ge 0$ for all x so that all probabilities are not negative

• $\int_{-\infty}^{\infty} f(x)\,dx = 1$ - often f(x) is only defined over a small range, in which
case the integral over that range will be 1.


You can find the probability that a random variable lies between x = a and
x = b from the area under the curve represented by f(x) between those
two points.


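Example 1

Show that f(x) is a probability density function where
$f(x) = \tfrac{1}{2}(x - 3)$ for 3 < x < 5.

Find P(X < 4).


$f(x) \ge 0$ for all x for which it is defined, and

$\int_3^5 \tfrac{1}{2}(x - 3)\,dx = \left[\tfrac{x^2}{4} - \tfrac{3x}{2}\right]_3^5 = \left(\tfrac{25}{4} - \tfrac{15}{2}\right) - \left(\tfrac{9}{4} - \tfrac{9}{2}\right) = 1$

so f(x) is a probability density function.

[Figure: sketch of the pdf.]

P(X < 4) = 0.25 (the area of a triangle with base 1 and height $\tfrac{1}{2}$).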




Example 2
For the following functions, state whether they could be used as a pdf, and if not explain why.

[Figure: sketches of three functions; the answers below are given in order from left to right.]

No - because there are negative values for f(x).
No - because the area under the pdf is 2.25.
Yes - there are no negative values and the total area = 1.


You can find the area under a curve y = f(x) between x = a and x = b by
calculating the integral $\int_a^b f(x)\,dx$. See Pure 2 for revision.

If f(x) is a probability density function (pdf),
$P(a < X < b) = \int_a^b f(x)\,dx$.
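
The two pdf conditions and the area rule for probabilities can also be checked numerically. The following is a minimal sketch, assuming the illustrative pdf f(x) = (x - 3)/2 on 3 < x < 5; the helper name `integrate` and the midpoint rule are choices made for this illustration, not part of the course.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    # Illustrative pdf: f(x) = (x - 3)/2 on 3 < x < 5, zero elsewhere.
    return 0.5 * (x - 3) if 3 < x < 5 else 0.0


total_area = integrate(f, 3, 5)   # should be close to 1 for a valid pdf
p_less_4 = integrate(f, 3, 4)     # P(X < 4) as the area from 3 to 4
print(round(total_area, 6), round(p_less_4, 6))
```

A numerical check like this does not replace the integration you are expected to show, but it is a quick way to catch an arithmetic slip.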


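Example 3

$f(x) = k(9 - x^2)$ for -3 < x < 3

Find a) the value of k and calculate P(-1 < X < 2)
b) P(X = 2).


a) $\int_{-3}^{3} k(9 - x^2)\,dx = k\left[9x - \tfrac{x^3}{3}\right]_{-3}^{3} = k\big((27 - 9) - (-27 + 9)\big) = 36k$

The total area must be 1, so $k = \tfrac{1}{36}$.

$P(-1 < X < 2) = \tfrac{1}{36}\left[9x - \tfrac{x^3}{3}\right]_{-1}^{2} = \tfrac{1}{36}\left(\left(18 - \tfrac{8}{3}\right) - \left(-9 + \tfrac{1}{3}\right)\right) = \tfrac{24}{36} = \tfrac{2}{3}$


b) For a continuous distribution the probability of a single value is 0.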


If you are given a sketch of a pdf, you might be required to write down the 
function form of the pdf in simple cases. 




Example 4 
Express the pdf shown in this sketch as a function. 


Find the equation of the line through A(2, 0) and B(3, 2): 


The gradient is $m = \tfrac{2 - 0}{3 - 2} = 2$

so the equation is y = 2x + c.
Substitute (2, 0): 0 = 4 + c, so c = -4.
Equation is y = 2x - 4.


The full form of the pdf is then 
f(x) =2x-4 for 2 <x <3 (and f(x) = 0 elsewhere). 


You know that the median of a data set is a value which has half the data less
than or equal to it, and the same is true for probability distributions, so α is the
median of a distribution if P(X < α) = 0.5.


In general this would be done by integration of the pdf using an upper limit
of α, setting the value equal to 0.5 and solving the resulting equation for α,
but in this course you are only expected to find the median in cases where it
can be found by the direct consideration of an area.


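Example 5
Find the median for the probability density function in Example 1:

$f(x) = \tfrac{1}{2}(x - 3)$ for 3 < x < 5.

[Figure: sketch of the pdf with the triangle to the left of the median α shaded.]

The base and height of the triangle shown are $(\alpha - 3)$ and $\tfrac{1}{2}(\alpha - 3)$, so using the
formula for the area of a triangle gives
$\tfrac{1}{2}(\alpha - 3) \cdot \tfrac{1}{2}(\alpha - 3) = \tfrac{1}{2} \Rightarrow (\alpha - 3)^2 = 2 \Rightarrow \alpha = 3 + \sqrt{2}$.


Alternative solution:

$\int_3^{\alpha} \tfrac{1}{2}(x - 3)\,dx = \left(\tfrac{\alpha^2}{4} - \tfrac{3\alpha}{2}\right) - \left(\tfrac{9}{4} - \tfrac{9}{2}\right) = \tfrac{1}{2}$
$\Rightarrow \alpha^2 - 6\alpha + 7 = 0$
$\Rightarrow \alpha = \tfrac{6 \pm \sqrt{36 - 28}}{2} = 3 \pm \sqrt{2}$
$\Rightarrow \alpha = 3 + \sqrt{2}$ (since α must lie between 3 and 5)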
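
As a cross-check on the median condition P(X < α) = 0.5, the equation can be solved numerically. This sketch assumes the pdf f(x) = (x - 3)/2 on 3 < x < 5 and uses bisection, a technique chosen for this illustration rather than one required by the course; the function names are hypothetical.

```python
import math


def area_up_to(a):
    # P(X < a) for f(x) = (x - 3)/2 on 3 < x < 5:
    # a triangle with base (a - 3) and height (a - 3)/2.
    return 0.25 * (a - 3) ** 2


def median(lo=3.0, hi=5.0, tol=1e-10):
    # Bisection: keep halving [lo, hi] around the point where the area is 0.5.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area_up_to(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(median(), 3 + math.sqrt(2))
```

The two printed values should agree to many decimal places, confirming the exact answer found from the triangle.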




Exercise 5.2 


1. Which of the following functions could be probability density functions?

[Figures a) to d): sketches of four functions.]

2. The graphs of two probability density functions are shown.

[Figures a) and b): graphs of two probability density functions; the horizontal axis of the first runs from -2 to 2.]

a) Find i) P(X<2) ii) P(|X|<1). b) Find i) P(X<2) ii) P(X<2.5).


3. Find the value of k for which each of the following functions can represent a 
probability density function (in each case f(x) is 0 outside the defined range). 


a) f(x)=ke -1l<x<1 b) f(x)=kx 05x53 
°) fx) =e +k 0<x<3 d) f(x)=ke O<x<1 


e) f(x) = 3 l<x<k 




4. The following functions represent a probability density function (in each 
case f(x) is 0 outside the defined range). 


a) f(x)=* O<x<2 Find P(X<1). 


2 
b) flx) = 2x O<x<4 Find P(1<X<2). 
°) fa) = 75 +k O<x<2. Findkand P(X> 1). 
d) f(a) = 2x7 O<x<2 


Find i) P(X<1) ii) P(X= 1.5). 
e) f(x)=ke  1<x<3  Findkand P(X <2.5). 
5. For each of the following probability density functions, give the equation.

Make sure you define it for all values of x from -∞ to ∞.

[Figures a) and b): graphs of two probability density functions; the horizontal axes run from 0 to 3 and from 0 to 1 respectively.]


6. For the probability density functions in the corresponding parts of 
question 5, find 


a) i) P(X>2) ii) P(X < 0.5) 
b) i) P(X>0.2) ii) P(X<1) 


5.3 Mean and variance of a continuous random variable 


The mean or expectation (expected value) of a discrete probability distribution is defined as

$\mu = E(X) = \sum p x$

For a continuous random variable

$\mu = E(X) = \int x f(x)\,dx$

where in practice the limits will be the limits of the interval over which
f(x) is defined.




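Example 6
Find the mean of X where $f(x) = \tfrac{1}{2}(x - 3)$ for 3 < x < 5.

[Figure: graph of $y = \tfrac{1}{2}(x - 3)$ for 3 < x < 5, with the mean marked on the x-axis.]

$\mu = E(X) = \int_3^5 \tfrac{1}{2}x(x - 3)\,dx = \tfrac{1}{2}\left[\tfrac{x^3}{3} - \tfrac{3x^2}{2}\right]_3^5 = \tfrac{13}{3}$

The mean is at $x = \tfrac{13}{3}$.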


Recall the definition of variance: $\mathrm{Var}(X) = \sigma^2 = E[\{X - E(X)\}^2]$.


You almost always use the computational form: $\sigma^2 = E(X^2) - \mu^2$ where $\mu = E(X)$.


For a continuous random variable $E(X^2) = \int x^2 f(x)\,dx$.


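Example 7
Find the standard deviation for the pdf in Example 1.


$E(X^2) = \int_3^5 x^2 f(x)\,dx = \int_3^5 \tfrac{1}{2}(x^3 - 3x^2)\,dx = \tfrac{1}{2}\left[\tfrac{x^4}{4} - x^3\right]_3^5 = \tfrac{1}{2}\left(\left(\tfrac{625}{4} - 125\right) - \left(\tfrac{81}{4} - 27\right)\right) = 19$

Remember that $\mu = E(X) = \tfrac{13}{3}$

so $\sigma^2 = E(X^2) - \mu^2 = 19 - \left(\tfrac{13}{3}\right)^2 = \tfrac{2}{9}$

and the standard deviation of X is $\tfrac{\sqrt{2}}{3} = 0.471$ (3 s.f.).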
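
The mean and variance integrals can be approximated numerically as a check on hand calculation. This is a sketch only, assuming the pdf f(x) = (x - 3)/2 on 3 < x < 5 and a simple midpoint rule; the helper `integrate` is a name introduced for the illustration.

```python
def integrate(g, a, b, n=100_000):
    """Approximate the integral of g from a to b with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def f(x):
    # Assumed pdf: f(x) = (x - 3)/2 on 3 < x < 5.
    return 0.5 * (x - 3)


mean = integrate(lambda x: x * f(x), 3, 5)          # E(X)
mean_sq = integrate(lambda x: x * x * f(x), 3, 5)   # E(X^2)
variance = mean_sq - mean ** 2                      # computational form
print(mean, variance, variance ** 0.5)
```

The results should be close to 13/3 for the mean, 2/9 for the variance and 0.471 for the standard deviation.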


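Example 8

The continuous random variable X has probability density function
f(x) = kx for 0 < x < 5

a) Find the value of k.

b) Find the mean and variance of X.

c) Calculate
i) P(X > μ) and
ii) P(X > μ + σ), where μ is the mean and σ is the standard
deviation of X.


a) $\int_0^5 kx\,dx = \left[\tfrac{kx^2}{2}\right]_0^5 = \tfrac{25k}{2} = 1 \Rightarrow k = \tfrac{2}{25}$

b) $\mu = E(X) = \int_0^5 \tfrac{2}{25}x^2\,dx = \left[\tfrac{2x^3}{75}\right]_0^5 = \tfrac{10}{3}$

$E(X^2) = \int_0^5 \tfrac{2}{25}x^3\,dx = \left[\tfrac{x^4}{50}\right]_0^5 = 12.5$

and $\sigma^2 = E(X^2) - \mu^2 = 12.5 - \left(\tfrac{10}{3}\right)^2 = \tfrac{25}{18}$

c) Writing $F(x) = P(X < x) = \int_0^x \tfrac{2}{25}t\,dt = \tfrac{x^2}{25}$ for 0 < x < 5:

i) $P(X > \mu) = 1 - F\left(\tfrac{10}{3}\right) = 1 - \tfrac{4}{9} = \tfrac{5}{9}$

ii) $P(X > \mu + \sigma) = 1 - F\left(\tfrac{10}{3} + \sqrt{\tfrac{25}{18}}\right) = 0.186$ (3 s.f.)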


Advice on calculator use

Note: You can work out a decimal value for μ + σ but you need to keep
enough decimal places in it to give answers to 3 s.f. The best way of doing
this is that if you calculate μ + σ = 4.5118... and that is the last calculation done on
your calculator, then the exact value can be used by pressing the Ans key.




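Example 9
The continuous random variable X has probability density function

$f(x) = \tfrac{1}{36}(9 - x^2)$ for -3 < x < 3

a) Find the mean and variance of X.

b) Calculate i) P(X > 2) and ii) P(|X| > σ), where σ is the standard deviation of X.


a) $\mu = E(X) = \int_{-3}^{3} x f(x)\,dx = \int_{-3}^{3} \tfrac{1}{36}(9x - x^3)\,dx = \left[\tfrac{1}{36}\left(\tfrac{9x^2}{2} - \tfrac{x^4}{4}\right)\right]_{-3}^{3} = 0$

(Note that this could have been found by the symmetry of the pdf.)

$E(X^2) = \int_{-3}^{3} x^2 f(x)\,dx = \int_{-3}^{3} \tfrac{1}{36}(9x^2 - x^4)\,dx = \left[\tfrac{1}{36}\left(3x^3 - \tfrac{x^5}{5}\right)\right]_{-3}^{3} = 1.8$

so $\sigma^2 = E(X^2) - \mu^2 = 1.8$.

b) i) $P(X > 2) = \int_2^3 \tfrac{1}{36}(9 - x^2)\,dx = \left[\tfrac{x}{4} - \tfrac{x^3}{108}\right]_2^3 = \left(\tfrac{3}{4} - \tfrac{27}{108}\right) - \left(\tfrac{2}{4} - \tfrac{8}{108}\right) = \tfrac{2}{27}$

ii) P(|X| > σ) will be 2 × P(X > σ) by symmetry, and $\sigma = \sqrt{1.8}$ from part a).

$P(X > \sigma) = \int_{\sqrt{1.8}}^{3} \tfrac{1}{36}(9 - x^2)\,dx = \left[\tfrac{x}{4} - \tfrac{x^3}{108}\right]_{\sqrt{1.8}}^{3} = \left(\tfrac{3}{4} - \tfrac{27}{108}\right) - \left(\tfrac{\sqrt{1.8}}{4} - \tfrac{1.8\sqrt{1.8}}{108}\right) = 0.18695...$

and P(|X| > σ) = 0.374 (3 s.f.)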




Example 10

The weekly petrol consumption, in hundreds of litres, of a sales representative
may be modelled by the random variable X with probability density function
$f(x) = ax^2(b - x)$ for 0 < x < 2.

a) Find the values of a and b if the mean consumption is 144 litres.

b) Find the standard deviation of the weekly petrol consumption.


a) The total probability is 1, so

$1 = \int_0^2 ax^2(b - x)\,dx = \left[\tfrac{abx^3}{3} - \tfrac{ax^4}{4}\right]_0^2 = \tfrac{8ab}{3} - 4a$

$\mu = E(X) = \int_0^2 x f(x)\,dx = \int_0^2 (abx^3 - ax^4)\,dx = \left[\tfrac{abx^4}{4} - \tfrac{ax^5}{5}\right]_0^2 = 4ab - \tfrac{32a}{5} = 1.44$

Solving the linear simultaneous equations in a and ab gives a = 0.15, b = 4.


b) $E(X^2) = \int_0^2 x^2 f(x)\,dx = \int_0^2 (abx^4 - ax^5)\,dx = \left[\tfrac{3x^5}{25} - \tfrac{x^6}{40}\right]_0^2 = \tfrac{96}{25} - \tfrac{64}{40} = \tfrac{56}{25}$

$\sigma^2 = \tfrac{56}{25} - 1.44^2 = 0.1664$


So the standard deviation of X is 0.408 and the standard deviation of the petrol
consumption is 40.8 litres (3 s.f.).
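
The two conditions on a and b for the pdf f(x) = ax²(b - x) on 0 < x < 2 are 8ab/3 - 4a = 1 (total probability) and 4ab - 32a/5 = 1.44 (mean). They are linear in the pair (ab, a), so they can be solved exactly. The sketch below uses Cramer's rule with Python fractions; the variable names are introduced for this illustration only.

```python
from fractions import Fraction

# Write p = a*b so that both conditions are linear in (p, a):
#   (8/3) p - 4 a      = 1       (total probability is 1)
#   4 p     - (32/5) a = 1.44    (mean, in hundreds of litres)
a11, a12, r1 = Fraction(8, 3), Fraction(-4), Fraction(1)
a21, a22, r2 = Fraction(4), Fraction(-32, 5), Fraction(144, 100)

det = a11 * a22 - a12 * a21          # determinant of the 2x2 system
p = (r1 * a22 - a12 * r2) / det      # Cramer's rule for p = a*b
a = (a11 * r2 - r1 * a21) / det      # Cramer's rule for a
b = p / a
print(a, b)
```

Working with `Fraction` avoids rounding, so the values of a and b come out exactly rather than as decimals that need interpreting.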


Exercise 5.3 
1. A probability density function is given by f(x) = % for4<x<5. 


Find the mean and variance of X. 




2. The continuous random variable X has probability density function 
f(x) =kx for0<x<3. 
a) Find the value of k. 
b) Find the mean and variance of X. 
c) Calculate
i) P(X > μ) and
ii) P(X > μ + σ), where μ is the mean and σ is the standard deviation of X.


3. The continuous random variable X has probability density function 
f(x) =ke for0<x<5. 
a) Find the value of k. 
b) Find the mean and variance of X. 


c) Calculate
i) P(X > μ) and
ii) P(X > μ - 2σ), where μ is the mean and σ is the standard deviation of X.


4. The continuous random variable X has probability density function 
f(x) =kx' for0<x<1. 
a) Find the value of k. 
b) Find the mean and variance of X. 
c) Calculate
i) P(X < μ) and
ii) P(|X - μ| < σ), where μ is the mean and σ is the standard deviation of X.


5. The length, in metres, of jumps that Carl makes may be modelled by the 
probability density function 


f(x) =k(x-6)? for5.5<x<6.5, 
a) State the value of E(X). 
b) Find the value of k. 


c) Find the variance of X. 




6. Buses arrive regularly at a bus stop every 15 minutes. 


a) Gupta does not know the bus schedule so he arrives at the bus stop at 
random times. 


i) If X is the length of time, in minutes, that Gupta has to wait for a bus,
explain why $f(x) = \tfrac{1}{15}$ for 0 < x < 15 is a sensible model for the
distribution.
ii) Find the mean and variance of X.
b) Shirma does know the bus schedule and aims to arrive shortly before a
bus is due. If Y is the length of time, in minutes, that Shirma has to wait for a
bus, and Y has probability density function given by f(y) = k(3 - y) for 0 < y < 3
i) find the value of k 
ii) find the mean and variance of Y. 
7. The profit, X, in $1000s, made on a speculative investment may be modelled by 


the probability density function given by f(x) = k(9 + 8x — x°) for -l<x<4. 
=. 

a) Show that k= 750° 

b) Find the mean and standard deviation of the profit made on the investment. 

c) Show that the probability of making a loss on the investment is et 


d) ‘There is an alternative investment on offer with a guaranteed return of $1500. 
Find the probability that the speculative investment gives a better return. 


8. The diameters, in cm, of chips made in car paintwork by flying stones 
during resurfacing of a road may be modelled by the random variable X 
with probability density function $f(x) = ax^2(b - x)$ for 0 < x < 3.


a) Show that $a = \tfrac{4}{27}$ and b = 3 if the mean diameter is 1.8 cm.


b) Find the standard deviation of the diameters. 


c) Comment on whether you think the given pdf is likely to be a good model 
to describe the diameters. 


5.4 Mode of a continuous random variable 


The mode of a continuous random variable is the value of x for which 
f(x) is a maximum over the interval in which f(x) exists. 


This is either a stationary point, at which f′(x) = 0,
or the end value of the interval over which f(x) is defined.


A sketch of the pdf will help you to make sense of the mode. 




There may not be a mode if no single value of x gives a larger value of f(x) than
every other (a pdf with two equal peaks is sometimes said to have two modes, but
this usage is now uncommon).


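Example 11
$f(x) = \tfrac{1}{36}(9 - x^2)$ for -3 < x < 3
Find the mode.

[Figure: sketch of the pdf, a curve symmetric about x = 0 with its maximum at x = 0.]

The mode is 0.


You could alternatively use differentiation:

$f(x) = \tfrac{1}{36}(9 - x^2) \Rightarrow f'(x) = -\tfrac{x}{18} = 0$ when x = 0.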


Example 12 


Find the mode of the pdf from Example 1 in Section 5.2,

$f(x) = \tfrac{1}{2}(x - 3)$ for 3 < x < 5.


Sketch the graph:

[Figure: sketch of the pdf, a straight line rising from (3, 0) to (5, 1).]

The derivative of f(x) here is $\tfrac{1}{2}$, which has no dependence on x, so there
will be no stationary point of f(x). This does not mean there is no mode.


The pdf increases steadily throughout the interval in which it is defined.
The mode for the random variable is 5 (the upper bound of the interval
in which f(x) is defined).
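
Because the mode can sit at a stationary point or at an end of the interval, a direct search over the whole interval is a useful check on a sketch. The following is a rough sketch of that idea, assuming the two pdfs f(x) = (9 - x²)/36 on -3 < x < 3 and f(x) = (x - 3)/2 on 3 < x < 5; the function `mode_on_interval` is a hypothetical helper, and a grid search only locates the mode to the resolution of the grid.

```python
def mode_on_interval(f, a, b, n=1000):
    # Evaluate f on an evenly spaced grid that includes both end points
    # and return the grid value of x where f(x) is largest.
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return max(xs, key=f)


# Mode at a stationary point inside the interval.
print(mode_on_interval(lambda x: (9 - x * x) / 36, -3, 3))

# Mode at the upper end of the interval: no stationary point exists.
print(mode_on_interval(lambda x: 0.5 * (x - 3), 3, 5))
```

Including the end points in the grid matters: a search that only looked for f′(x) = 0 would miss the mode of the second pdf entirely.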




Exercise 5.4 


1. The continuous random variable X has probability density function given by
f(x)=3(e-1F OS x <2, 

a) Sketch the probability density function. 

b) State the value of i) the mode and ii) the median of X. 


2. For the random variables with the following probability density functions,
find i) the median
ii) the mode.


3. The continuous random variable X has probability density function given by
f(x) = kx(10 - x), 0 < x < 7.

a) Find the value of k.

b) Find the mode of X.


Summary exercise 5 


1. A continuous random variable X has a probability density function of the form
$f(x) = A(x^2 + 4)$, 0 < x < 1.
a) Calculate the value of A.
b) Find P(0.5 < X < 1) to 2 decimal places.
c) Calculate E(X) and Var(X).

2. A continuous random variable X has a probability density function of the form
f(x)=240,0<x<3
6
a) Calculate the value of C.
b) Find P(1 < X < 2) to 2 decimal places.
c) Calculate E(X) and Var(X).

3. A continuous random variable X has a probability density function of the form
$f(x) = 1 - \tfrac{x}{2}$, 0 < x < 2.
a) Find P(1 < X < 2).
b) Calculate E(X) and Var(X).
c) Calculate P(X < μ) where μ = E(X).

4. A factory produces discs to operate electronic games machines. The slots in the
electronic games machines are 3 cm long and the factory is requested to make discs
whose diameters are no more than 2.9 cm and no less than 2.7 cm. The probability
density function of X, the disc diameter, is
$f(x) = \tfrac{3000}{32}(3.1 - x)(x - 2.7)$, 2.7 < x < 3.1.
A disc is chosen at random. Find:
i) the probability that the disc does meet the required specification
ii) the probability that the disc does not meet the required specification, but will
still fit in the slot.

5. A continuous random variable X has probability density function given by
f(x) = aad 0<x<1
0 otherwise.
Find
a) P(X > 0.5)
b) the mean and variance of X.


EXAM-STYLE QUESTIONS

6. [Figure: graph of the probability density function f.]

The diagram shows the graph of a probability density function, f, where
$f(x) = k(1 - x^2)$, -1 ≤ x ≤ 1, and f(x) = 0 elsewhere.

a) Show that k = 0.75.

b) State the value of E(X).

c) Find the variance of X.




7. The random variable X has probability density function f, where
f(x) = k(2 - x), 1 ≤ x ≤ 2, and f(x) = 0 elsewhere.

a) Find the value of k.

b) Find the median of X.


8. [Figure: graph of the probability density function f, horizontal axis from 0 to 7, vertical axis from 0.0 to 0.6.]

The diagram shows the graph of a probability density function, f, where
$f(x) = \tfrac{k}{x}$, 1 ≤ x ≤ 7, and f(x) = 0 elsewhere.

a) Show that $k = \tfrac{1}{\ln 7}$.

b) Find the exact value of E(X) and give Var(X) correct to 3 significant figures.


9. The diagrams show the probability density
functions of four random variables P, Q, R

and S. Each of the four variables takes values 

between 0 and 2 only, and their medians are 
,m, ,m, ,m, respectively. 


List the medians in order, starting with the smallest. 


[Figure: graphs of the probability density functions of P, Q, R and S.]
Chapter summary 


• Continuous random variables are needed to describe situations involving length, time, mass,
density, volume, etc.
• A probability density function (pdf), f(x), must satisfy these basic properties:

$f(x) \ge 0$ for all x so that all probabilities are not negative
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ - often f(x) is only defined over a small range, in which case
the integral over that range will be 1.

• For a continuous random variable:
$\mu = E(X) = \int x f(x)\,dx$, where in practice the limits will be the limits of the interval over
which f(x) is defined.
$\mathrm{Var}(X) = \sigma^2 = E[\{X - E(X)\}^2]$ but you almost always use the computational form
$\sigma^2 = E(X^2) - \mu^2$ where $\mu = E(X)$.
$E(X^2) = \int x^2 f(x)\,dx$

• The median of a continuous random variable X is α if P(X < α) = 0.5. This can be identified
by geometrical arguments and can also be found by using integration.




6 Sampling 


Collecting data can be extremely costly, time consuming and even dangerous, so
there is now a specialised area of research and development into sampling methods to
provide cost-effective ways of producing the information required.

Applications range from the mundane, such as acceptance sampling techniques, which are
used in industry when taking delivery from component suppliers, to cutting-edge science
such as monitoring the effects of global warming.


Objectives
After studying this chapter you should be able to:

• Understand the distinction between a sample and a population, and appreciate the necessity
for randomness in choosing samples.

• Explain in simple terms why a given sampling method may be unsatisfactory (knowledge
of particular sampling methods, such as quota or stratified sampling, is not required, but
candidates should have an elementary understanding of the use of random numbers in
producing random samples).

• Recognise that a sample mean can be regarded as a random variable, and use the facts that
$E(\bar{X}) = \mu$ and that $\mathrm{Var}(\bar{X}) = \tfrac{\sigma^2}{n}$.

• Use the fact that $\bar{X}$ has a normal distribution if X has a normal distribution.

• Use the Central Limit Theorem where appropriate.


Before you start
You should know how to:

1. List the possible outcomes in a simple experiment, e.g.
I have a set of four cards with the numbers 1, 2, 3 and 4 on them. List the possible
outcomes if I sample two cards at random without replacing the first.

1,2; 1,3; 1,4; 2,1; 2,3; 2,4; 3,1; 3,2; 3,4; 4,1; 4,2; 4,3


Skills check:

1. I have a set of four cards with the numbers 1, 2, 3 and 4 on them. List the possible
outcomes if I sample two cards at random and replace the first before drawing the
second card.


6.1 Populations, census and sampling 


A sampling frame is a list containing all the units or elements which are
the members of the population to be sampled. Because sampling is a
practical activity, there will be times that the sampling frame will not
match the population exactly.


For example, a voting register may still include people who have died and
may omit people who have completed the application forms but the forms
are still being processed. A real-life example occurred in 2007 when it was
found that some TV companies were not including all the entrants who
phoned into premium rate competition lines when drawing the competition
winners.


The sample needs to be constructed so that bias is avoided, and ideally so 
that the sample will return an estimate very close to the true population 
parameter it is estimating. 


A sample looks at some of the population.
A census examines the full population.


Random sampling 


Each member of the population has an equal chance of being selected, and 
all possible combinations are equally likely. 


Simple random sampling means sampling without replacement. In practice 
this is usually done by using random numbers or some other random 
process, e.g. taking names out of a hat — if it can be done in a genuinely 
random way. 
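
Random numbers are usually produced by software rather than from a hat. The sketch below shows one way to draw a simple random sample without replacement using Python's standard `random` module; the sampling frame of 200 student labels and the function name `simple_random_sample` are made up for the illustration.

```python
import random


def simple_random_sample(frame, k, seed=None):
    # Draw k distinct members of the sampling frame; every subset of
    # size k is equally likely. A seed makes the draw repeatable.
    rng = random.Random(seed)
    return rng.sample(frame, k)


# Hypothetical sampling frame: each student listed exactly once.
frame = [f"student_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Sampling without replacement means no student can appear twice in the sample, which matches the description of simple random sampling above.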


You do not have to deal with different types of sampling methods in this 
course, but it is worth being aware that there are a lot of different methods 
developed to give the best possible estimates in different situations. 


Methods of collection 


Data can be collected automatically by electronic measurement devices, e.g. 
in a hospital or a production process, or manually through surveys, 
questionnaires, and by direct observation and recording. 


You need to take care when designing questionnaires, to ensure that 
questions are unambiguous and do not use language which will bias the 
respondent's answer. They should be as simple as possible to understand and 
answer in order to collect the required information. You also need to 
consider how the information is to be analysed to ensure that the data are 
collected in a form which makes the analysis easily accessible. 




Example 1

The population is all the students at an 11-18 school.

Suggest at least two different sampling frames which could be used for this population.

Any system which lists all students in the school exactly once will be a
sampling frame:

• Alphabetical list of all students in school.

• Alphabetical lists of students by year group.

• Alphabetical lists of students by tutor group.

• List of students by date of birth, with alphabetical ordering for any
students with the same date of birth.

• Any of these could take boys and girls separately and a different
sampling frame would be produced.


Example 2 


Suggest possible populations and sampling frames to investigate the following:
a) whether a university has a gender bias in its intake procedures
b) where people in a town do their grocery shopping


a) This is relatively straightforward. The population is anyone who applied to the university - for
one or more years depending on how extensive the investigation will be. The corresponding
sampling frame will be a list of these people (possibly alphabetical, but any list will do).


b) This is much less straightforward and highlights some of the practical problems faced in 
sampling. The population of interest is the people in the town who shop for groceries, but this 
is not very precisely defined — and there is no easily accessible way to produce a sampling 
frame which lists these people. Whatever registers (if any) which are held for voting, or tax 
collection, might give something which can be used as a ‘best available’ sampling frame — but 
these would vary from country to country. Any results obtained from surveys in a situation 
like this need to be treated with caution. 


Exercise 6.1 
1. a) Explain what you understand by the terms 
i) population 
ii) census. 
b) Give an example of bias in sampling. 
2. a) Explain what you understand by the terms 
i) sampling frame 


ii) sampling unit. 




b) We want to collect data on age, ethnicity, education level and 
sex for adults in the UK. 


i) Describe the population, and a possible sampling frame in 
this situation. 


ii) Explain why the sampling frame is unlikely to exactly match
the population.

3. Describe the important characteristics of a simple random sample. 

4. A bank hires a market research company to conduct a survey of its 
customers to see how it could improve its service. 

a) State what the population is in this situation. 


b) Suggest a possible way to construct a sampling frame. 


c) List any difficulties you can identify in making sure that any 
individual customer does not appear more than once in the 
sampling frame. 


6.2 Advantages and disadvantages of sampling 


In some circumstances taking a census is simply not possible or desirable, 
e.g. in the run-up to elections, opinion polls are taken to provide some 
information on what people think on some issues. Here the population is 
everyone who could vote in the election but the amount of time 

(and money) required to ask everyone would be far too high - and the 
time required to process the information might mean that the election 
had already taken place before the analysis was available. 


Moreover, even asking everyone would not guarantee a correct prediction 
because people are under no obligation to respond truthfully to this sort of 
opinion poll, or they may answer truthfully at the time and later change 
their mind and vote differently. 


Sampling is generally much cheaper than taking a census, and almost 

always it is cost effective provided good sampling procedures are followed. 

A high proportion of the information available from a census can often be 
obtained from a sample which costs a fraction of what the census would cost. 


In certain situations, testing items results in them being destroyed and a 
census is not appropriate. In many situations, it is important to have the 


analysis of the sample available quickly in order to inform decision making. 


However, a sample will always give incomplete information while a census 
tells you everything about the population. 




The key in sampling is to ensure the best possible quality of information 

for the resources that are available, and in particular to avoid sources of 

bias. There is natural variation between individual sampling units, which 

is due to chance, but there are often systematic differences between groups 
of sampling units with one or more shared characteristics, e.g. there are often 
differences between males and females. 


Bias occurs where there is a tendency for a sample selected to overrepresent or
underrepresent certain groups — it is a property of the sampling method 
rather than the sample itself, e.g. if a sample of 50 is taken by simple random 
sampling using a council tax register which has equal numbers of males and 
females on it, then even if the sample ends up with 22 men and 28 women 
there is no bias present. However, if the method used is to select a property 
at random and to take the first named person on the register at that property, 
then the sample will be biased because the first named person will more 
often be male. There are a number of common sources of bias: 


• Subjective choice by person taking the sample

For example, a person conducting a survey in a town centre may ask only people
‘looking respectable’ because they are thought more likely to agree to participate.

• Self-selection

For example, radio phone-in surveys are strongly biased. Only people who feel
particularly strongly on the issue will participate.

• Non-response

Almost all surveys have an element of ‘self-selection’ in them because
there is little the person conducting a survey can do to require people
to take part. The response rates for postal surveys are critical in knowing
how worthwhile any information will be when the analysis is done.

Companies will often offer inducements (entry into a prize draw, for
example) to encourage people to take part in their surveys.

• Convenience sampling (opportunity sampling)

Items are chosen according to which is the easiest, rather than through any structured
and systematic attempt to construct an unbiased sample.

• Sampling from an incomplete sampling frame

For example, if you use a telephone directory or a list of registered voters as a
sampling frame when the population is the adults in a community then
there will be groups of people who are seriously underrepresented.


You are not expected to know details of different sampling methods, but it may help 
to get a feel for what the issues are by reflecting on any good or bad things you can 
see in different methods. There is a section at the end of this chapter which lists very 
briefly the main features of a number of different sampling methods. 




Exercise 6.2 

1. An airline sends out a postal questionnaire to enquire about customers’ 
views on a new route they are introducing. They say that all completed 
questionnaires will be entered into a draw to win free flights on this 
new route. 


a) Explain why the sample is biased. 
b) Explain why the airline might make this offer. 


2. Give an example of a convenience sample. 


3. A manufacturer of mobile phone batteries wants to know how long 
they will operate for before needing to be charged. 


a) Explain why a sample should be used rather than a census. 


The batteries are packed in boxes of 50 as they come off the production
line. A manager suggests taking one of the boxes and sending them to 
be tested as the sample. 


b) Assuming 50 is a reasonable size of sample to take, do you think 
this is a good way to choose the batteries to be tested? Give any 
advantages and disadvantages you can see with the method and 
suggest an alternative if you do not think it is a good way to do it. 


4. A bank hires a market research company to conduct a survey of its 
customers to see how it could improve its service. 
a) Explain why a sample is more appropriate here than a census. 


b) Can you think of any reasons why the bank might not want to just 
take a simple random sample of its customers? 


6.3 Variability between samples and use of random numbers 


If a number of people each asked a random sample of four pupils
in your class some questions, how variable would the responses be? 
Does it depend on the question? 


Think what answers you would get if you asked: 
How long did it take you to get to school this morning? 
How many DVDs do you own? 


Do you own a mobile phone? 


What is your favourite type of film at the moment? 




Here are the answers from a dozen pupils at one school. 


Pupil identifier | Time to get to school (mins) | Number of DVDs | Own a mobile phone | Favourite type of film
A | 2 | 43 | Y | Action
B | 15 | 25 | Y | Comedy
C | 10 | 21 | Y | Horror
D | 25 | 26 | Y | Action
E | 4 | 0 | Y | Drama
F | 10 | 36 | Y | Comedy
G | 5 | 89 | Y | Horror
H | 10 | 27 | Y | Horror
I | 15 | 34 | Y | SciFi
J | 15 | 29 | Y | Action
K | 20 | 31 | Y | Drama
L | 5 | 18 | Y | Comedy

If you, and five friends, had each taken a sample of size 4 chosen at random
from the pupils in this class (and this is a very small class really), would
you all have got the same results? How much difference would there be?
Is it the same for all the questions that were asked?


You could put the letters A-L on slips of paper in a hat and each draw four 
out at random. There are other ways you can do the same thing using 
random numbers - either from a table, or using a computer or calculator 
to generate them. 


Because there are 12 to choose from you need to use a two-digit number, 
but there are actually 100 two-digit numbers (00 to 99). The simplest way 
is to decide that you will ignore 00, or anything from 13 to 99, and 

take another number. If you get a number you have already chosen you 
will pick another number until you have four different numbers. 




39 | 63 | 46 | 23 | 49 | 10 | 52 | 06 
79 | 64 | 42 | 80 | 00 | 05 | 95 | 35 
53 | 65 | 72 | 41 | 61 | 48 | 24 | 96 
81 | 60 | 06 | 79 | 31 | 33 | 18 | 84 
39 | 68 | 17 | 43 | 56 | 27 | 92 | 37 
80 | 36 | 17 | 25 | 27 | 44 | 96 | 33 
86 | 22 | 80 | 24 | 93 | 30 | 15 | 11 
31 | 15 | 61 | 02 | 81 | 93 | 34 | 67 


Starting at the top left and reading across, my sample numbers are 
10, 6, 5 and 11 (6 occurred a second time, but was ignored). 


My sample then would be: 
Pupil identifier | Time to get to school (mins) | Number of DVDs | Own a mobile phone | Favourite type of film
J (10) | 15 | 29 | Y | Action
F (6) | 10 | 36 | Y | Comedy
E (5) | 4 | 0 | Y | Drama
K (11) | 20 | 31 | Y | Drama


In my sample: 


• the average time to get to school is 12 minutes (to the nearest minute)

• the average number of DVDs owned is 24

• 100% own a mobile phone

• there are three different film genres, with Drama appearing twice.


A simple random sample is the equivalent of putting names in a hat and 
drawing them out blindly. All individuals in the population are equally likely 
to come out at any stage, and all possible combinations are equally likely. 


Some simple arithmetic can allow you to be much more efficient in the use
of random numbers. If you ignore only 00 and 97, 98, 99 in the 100
two-digit numbers, and take the remainder when the number chosen is
divided by 12 (counting a remainder of 0 as 12), then all 96 numbers from
01 to 96 will map onto 1 to 12 with equal likelihood.


In this case you will get a different sample: 


39 | 63 | 46 | 23 | 49 | 10 | 52 | 06 


79 | 64 | 42 | 80 | 00 | 05 | 95 | 35 


53 | 65 | 72 | 41 | 61 | 48 




39 gives a remainder of 3 (pupil C); so does the next number in the list (63),
so it is passed over; then 46 gives a remainder of 10 (pupil J), 23 gives a
remainder of 11 (pupil K), and 49 gives a remainder of 1 (pupil A).

Pupil identifier | Time to get to school (mins) | Number of DVDs | Own a mobile phone | Favourite type of film
C (3) | 10 | 21 | Y | Horror
J (10) | 15 | 29 | Y | Action
K (11) | 20 | 31 | Y | Drama
A (1) | 2 | 43 | Y | Action


In this sample: 

• two of the pupils are the same as in the previous sample, and two are different

• the average time to get to school is again 12 minutes (to the nearest minute)

• the average number of DVDs owned is now 31

• 100% own a mobile phone

• there are three different film genres, with Action appearing twice this
time, but not the same three: Action appears and Comedy does not.
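
The two ways of using the random number table can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the list holds the two-digit numbers from the table above read across from the top left, and pupils A to L are numbered 1 to 12.

```python
# Two-digit random numbers from the table above, read across from the top left.
RANDOM_NUMBERS = [39, 63, 46, 23, 49, 10, 52, 6,
                  79, 64, 42, 80, 0, 5, 95, 35,
                  53, 65, 72, 41, 61, 48, 24, 96,
                  81, 60, 6, 79, 31, 33, 18, 84,
                  39, 68, 17, 43, 56, 27, 92, 37,
                  80, 36, 17, 25, 27, 44, 96, 33,
                  86, 22, 80, 24, 93, 30, 15, 11]
PUPILS = "ABCDEFGHIJKL"  # pupil 1 is A, ..., pupil 12 is L


def sample_by_rejection(numbers, population, size):
    """Ignore 00 and anything above the population size; skip repeats."""
    chosen = []
    for r in numbers:
        if 1 <= r <= population and r not in chosen:
            chosen.append(r)
        if len(chosen) == size:
            break
    return chosen


def sample_by_remainder(numbers, population, size):
    """Ignore 00 and the few numbers at the top of the range (97-99 for 12),
    then use the remainder on division by the population size,
    reading a remainder of 0 as the last member of the population."""
    usable = (100 // population) * population  # 96 when population is 12
    chosen = []
    for r in numbers:
        if r == 0 or r > usable:
            continue
        k = r % population or population
        if k not in chosen:
            chosen.append(k)
        if len(chosen) == size:
            break
    return chosen


first = sample_by_rejection(RANDOM_NUMBERS, 12, 4)
second = sample_by_remainder(RANDOM_NUMBERS, 12, 4)
print([PUPILS[k - 1] for k in first])   # ['J', 'F', 'E', 'K']
print([PUPILS[k - 1] for k in second])  # ['C', 'J', 'K', 'A']
```

The remainder method needs only the first five numbers in the table, while the rejection method has to read almost to the end of it to find four usable numbers.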


Tables of random digits are often presented in blocks of 5 (see below) to
enable you to use them flexibly. For example, if you want two-digit numbers
it is probably quicker to take the first pair and the last pair out of each
block of five digits than to use all the digits. As long as you decide in advance
the rule by which you will identify the random numbers to use,
you can do whatever you find easy.


41089 57286 78925 86189 48509 86176 


45087 14740 92741 10088 41571 97806 
28446 39969 32627 54204 32209 59745 


Advice on calculator use 

Random function on calculator (actually pseudo-random numbers): most
calculators have at least one generator on them. The first one included gives a number
between 0 and 1, often to 3 decimal places, sometimes to rather more. These can be
used to generate random integers between a and b by multiplying by (b − a + 1),
adding a, and taking the integer part of the answer. For a number between 1 and 25 you
multiply by 25, add 1, and take the integer part, e.g. 0.341 → 8.525 → 9.525 → 9. On
a programmable calculator a series of such random integers can be produced by
Int(Rnd# × 25 + 1), and pressing EXE repeatedly generates the series required.

You should make sure that you know what functions are on your calculator and how to 
use them. 
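
The same rule can be tried on a computer. The sketch below is not a calculator program; it is a Python illustration in which the function name random_integer is invented here and Python's own pseudo-random generator stands in for the calculator key.

```python
import math
import random


def random_integer(r, a, b):
    """Turn a decimal r with 0 <= r < 1 into an integer from a to b inclusive."""
    return math.floor(r * (b - a + 1) + a)


# The worked value from the text: 0.341 -> 8.525 -> 9.525 -> 9
print(random_integer(0.341, 1, 25))

# Ten random integers between 1 and 25 (the seed is arbitrary).
rng = random.Random(1)
draws = [random_integer(rng.random(), 1, 25) for _ in range(10)]
print(draws)
```

Because r is always less than 1, the largest value before taking the integer part is just under b + 1, so b itself can occur but b + 1 cannot.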




Exercise 6.3 


1. A Head of Year wants to find out where pupils would like to go on
an end of year trip. He decides to ask a sample of 10 pupils out of the
year group, which has 90 pupils in it. Which of the following will give
him a simple random sample?

a) Put all the pupil names in a hat and draw out 10 names.

b) List pupils alphabetically, choose a number from 1 to 9 at random
to give the first pupil, and then take every 9th pupil in the list from there.

c) Take the first 10 pupils who arrive at the next year assembly.

d) List the pupils in order of their total examination score and take the
first 10 pupils on the list.

e) List the pupils in order of their total examination score and assign
each a number from 1 to 90. Use a set of two-digit random numbers
to choose 10 pupils from the list (ignore 00 and 91-99 if they appear,
and any repeats, and continue until you have 10 different numbers
between 01 and 90 inclusive).

2. Use the table of random numbers below to generate simple random
samples for each of the following situations.


a) A sample of size 10 from a population of 90, starting at the top left 
and reading across. 


b) A sample of size 10 from a population of 90, starting at the top left 
and reading down. 

c) A sample of size 5 from a population of size 42, starting at the top
left and reading across. 


d) A sample of size 6 from a population of size 38, starting on the 
left of the 3rd row and reading across. 


52 98 44 17 71 20 17 63 47 88 22 02 18 9 93
54 73 99 99 17 50 60 78 94 55 73 39 58 00 98
07 67 88 88 97 76 20 65 22 97 16 9% 60 11 65
08 64 14 80 32 76 25 84 30 92 59 01 65 49 68
43 71 56 2 82 26 27 16 95 87 77 03 75 35 18
49 46 58 74 98 26 73 80 79 19 50 87 35 43 63
24 35 79 43 81 42 34 24 01 41 10 63 74 97 65




3. Your calculator probably has a random number generator Ran# which
generates a three-digit decimal. Use this to take another five samples
(each of size 4) from the class data in the section above, and generate
similar summaries of the results to the one listed above.

Is there any difference in the variability of the answers to the
four questions asked in the survey?

Advice on calculator use

Many calculators will display 0.230 only as 0.23, and 0.600 as 0.6,
so if the display shows less than 3 decimal places add one or two 0s.

4. a) Using the random number generator on your calculator, take a
simple random sample of size 4 from the table below, and calculate
the mean of the values in your sample.


Person | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
Value | 38 | 39 | 40 | 41 | 42 | 53 | 54 | 55 | 56 | 57 | 68 | 69 | 70 | 71 | 72 | 83 | 84 | 85 | 86 | 87


b) Take four more simple random samples from the same data set and 
calculate the mean of each (you will use these again in the next 
exercise so don't lose them). 


5. Using the random number generator on your calculator, take 
a simple random sample of size 10 from the table below, and calculate 
the mean of your sample. 


40 51 53 47 39 38 33 35 46 30 45 44 36 33 47 
53 38 31 45 34 38 49 51 53 45 47 53 41 40 31 
42 52 45 38 36 52 38 51 50 31 32 31 49 54 34 
50 40 39 50 32 43 52 30 36 44 42 44 45 50 54 
54 54 41 39 40 46 41 45 43 48 44 37 42 51 53 
49 49 44 37 30 54 42 33 46 44 


6.4 The sampling distribution of a statistic 


In Sections 6.4-6.7 you can assume that, when it comes to analysing results 
of sampling, a simple random sampling method has been used. This ensures 
that all possible combinations of sampling units are equally likely to be 
chosen in the sample, that the method is unbiased and that the results 

in Sections 6.4-6.7 hold. 


If you throw a pair of dice repeatedly and take the average score on 
the two dice, you can generate the probability distribution for this 
quite easily. 




The probability distribution of the mean score $\bar{X}$ is:

$\bar{x}$ | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 | 4 | 4.5 | 5 | 5.5 | 6
$P(\bar{X} = \bar{x})$ | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36
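
One way to check this distribution is to list all 36 equally likely outcomes for the pair of dice and count how often each average occurs. The Python sketch below does that; the variable names are chosen only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes and their averages.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

# Probability of each possible average, as an exact fraction.
distribution = {m: Fraction(n, 36) for m, n in sorted(counts.items())}

for m in distribution:
    print(float(m), f"{counts[m]}/36")
```

The counts rise from 1 to 6 and fall back to 1, which is the triangular shape of the table.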


Think of sampling throws of a fair die. If you throw the die n times and 
take the average of the scores, you have taken a simple random sample and 
for any value of n (in theory) you could work out a probability distribution 
of the mean. 


The mean of the n throws is called a statistic (it has to depend only on the 
values observed in the random sample), and the probability distribution is 
known as the sampling distribution of the statistic. 


Examples of other statistics are the median, the standard deviation of the 
sample, the range, etc., and the sampling distributions for some of these can 

be surprisingly complicated. Fortunately the ones which are most useful 

have relatively simple distributions, or can be approximated by something simple. 


Example 3
The heights of 15-year-old males in a large town have a mean $\mu$ and a
standard deviation $\sigma$. The heights of a sample of twenty 15-year-old males
are recorded as $x_1, x_2, x_3, \ldots, x_{20}$.

Which of the following are statistics? For any which are not, explain why not.

a) $x_1$

b) $\sum_{i=1}^{20} (x_i - \mu)^2$

c) $\sum_{i=1}^{20} (x_i - \bar{x})$, where $\bar{x} = \sum_{i=1}^{20} x_i \div 20$

d) the largest of the values $x_1, x_2, x_3, \ldots, x_{20}$

e) the range of the sample values

f) $\dfrac{\bar{x} - \mu}{\sigma}$

g) the number of values $x_1, x_2, x_3, \ldots, x_{20}$ which are greater than $\mu$

h) the number of values $x_1, x_2, x_3, \ldots, x_{20}$ which are greater than $\bar{x}$.

A statistic must depend on some or all of the values of sample observations,
and not on anything else, so a, c, d, e and h are all statistics.
b and g need the value of $\mu$, and f needs the values of $\mu$ and $\sigma$, so these three are not statistics.




One of the simplest situations is when the ‘measurement’ of each member
of a sample can be modelled by a random variable taking the value 1 when 
some criterion is satisfied and 0 when it is not: e.g. if a person visiting a 
health centre is a male, or if a light bulb on a production line is faulty, or a 
bank manager intends to retire before their 60th birthday. 

In this case, the sampling distribution of a sample of size n will be the 
distribution of a binomial random variable with parameters n and p, 

where p is the probability the criterion is satisfied for any individual. 


Example 4
20% of the trainee teachers in a large college are male. A random sample of 15 trainee teachers
is taken from the college. The random variables $X_i$, $i = 1, 2, 3, \ldots, 15$, are defined as

$X_i = \begin{cases} 1 & \text{if the } i\text{th trainee teacher is male} \\ 0 & \text{if the } i\text{th trainee teacher is female.} \end{cases}$

a) Write down the distribution for $\sum_{i=1}^{15} X_i$.

b) Find $P\left(\sum_{i=1}^{15} X_i = 3\right)$.

c) Give the values of $E\left(\sum_{i=1}^{15} X_i\right)$ and $\mathrm{Var}\left(\sum_{i=1}^{15} X_i\right)$.

a) This is a binomial distribution with n = 15 and p = 0.2.

b) 0.250

c) $E\left(\sum_{i=1}^{15} X_i\right) = np = 15 \times 0.2 = 3$; $\mathrm{Var}\left(\sum_{i=1}^{15} X_i\right) = npq = 15 \times 0.2 \times 0.8 = 2.4$
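
The answers to Example 4 can be checked directly from the binomial formula. The sketch below assumes that part b) asks for the probability that exactly 3 of the 15 trainees are male, which is the value that reproduces the printed answer of 0.250; the helper name binomial_pmf is made up for this illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 15, 0.2
prob_3 = binomial_pmf(3, n, p)   # assumed target of part b)
mean = n * p                     # np
variance = n * p * (1 - p)       # npq

print(round(prob_3, 3), mean, round(variance, 2))
```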


In other situations, you may be able to derive a sampling distribution 
directly from listing possible outcomes and their associated probabilities. 




Example 5 


A bag contains three 50 cent coins, two 20 cent coins and a 10 cent coin. 
Two coins are to be taken from it. Find the sampling distribution of the 
mean value of the coins taken. 


Construct a table showing the 30 (equally likely) possible outcomes of 
taking two coins from this bag, and what the mean value is, to give the 
sampling distribution: 


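Mean value of the two coins (first coin across, second coin down; a coin cannot be taken twice):

x | 50 | 50 | 50 | 20 | 20 | 10
50 | - | 50 | 50 | 35 | 35 | 30
50 | 50 | - | 50 | 35 | 35 | 30
50 | 50 | 50 | - | 35 | 35 | 30
20 | 35 | 35 | 35 | - | 20 | 15
20 | 35 | 35 | 35 | 20 | - | 15
10 | 30 | 30 | 30 | 15 | 15 | -

$\bar{x}$ | 50 | 35 | 30 | 20 | 15
$P(\bar{X} = \bar{x})$ | 6/30 | 12/30 | 6/30 | 2/30 | 4/30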


From this very simple situation, you get a quite strange looking distribution: 
[Figure: bar chart, "Sampling distribution choosing two coins". Horizontal axis: average value of coins (cents), with bars at 15, 20, 30, 35 and 50. Vertical axis: probability, from 0 to 0.45.]


Example 6 


How would the sampling distribution be different if it was a really large bag 
of coins which had 50 cent, 20 cent and 10 cent coins in the ratio of 3:2:1? 




You could model this by thinking of using the same bag, but replacing 
the coin after noting the value. 


The table and probability distribution then looks like this: 


x 50 50 50 20 20 10 


50 50 50 50 35 35 30 


50 50 50 50 35 35 30 


50 50 50 50 35 35 30 


20 35 35 35 20 20 15 
20 35 35 35 20 20 15 
10 30 30 30 15 15 10 


$\bar{x}$ | 50 | 35 | 30 | 20 | 15 | 10
$P(\bar{X} = \bar{x})$ | 9/36 | 12/36 | 6/36 | 4/36 | 4/36 | 1/36


[Figure: bar chart, "Sampling distribution choosing two coins from large bag". Horizontal axis: average value of coins (cents), with bars at 10, 15, 20, 30, 35 and 50. Vertical axis: probability.]
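
Both coin examples can be checked by listing the ordered pairs of coins: 30 pairs when the first coin is not replaced (Example 5) and 36 when it is (Example 6). The Python sketch below is one way to do this; the function name mean_distribution is invented for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product

COINS = [50, 50, 50, 20, 20, 10]  # values in cents


def mean_distribution(pairs):
    """Exact sampling distribution of the mean of equally likely pairs."""
    pairs = list(pairs)
    counts = Counter(Fraction(a + b, 2) for a, b in pairs)
    return {m: Fraction(c, len(pairs))
            for m, c in sorted(counts.items(), reverse=True)}


# Example 5: two different coins from the bag (30 ordered pairs).
without_replacement = mean_distribution(permutations(COINS, 2))
# Example 6: the coin is replaced before the second draw (36 ordered pairs).
with_replacement = mean_distribution(product(COINS, repeat=2))

for m, p in without_replacement.items():
    print("no replacement  ", m, p)
for m, p in with_replacement.items():
    print("with replacement", m, p)
```

The fractions are printed in lowest terms, so 12/30 appears as 2/5. A mean of 10 cents is only possible with replacement, because the bag holds a single 10 cent coin.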


Exercise 6.4 
1. The incomes of doctors working for a large health organisation have a
mean $\mu$ and a standard deviation $\sigma$. The incomes of a sample of 20
doctors working for the organisation are recorded as $x_1, x_2, x_3, \ldots, x_{20}$.
Which of the following are statistics? For any which are not, explain why not.
a) x, -x, 


b) $\sum_{i=1}^{20} (x_i - 80000)^2$

c) $\sum_{i=1}^{20} (x_i - \bar{x})$, where $\bar{x} = \sum_{i=1}^{20} x_i \div 20$

d) the median of the values $x_1, x_2, x_3, \ldots, x_{20}$

e) $\dfrac{\bar{x} - \mu}{\sigma}$

2. One third of the members of a fitness club are over 40 years old.
A random sample of 20 members is taken from the club.
The random variables $X_i$, $i = 1, 2, 3, \ldots, 20$, are defined as

$X_i = \begin{cases} 1 & \text{if the } i\text{th member is over 40} \\ 0 & \text{if the } i\text{th member is not over 40.} \end{cases}$

a) Write down the distribution for $\sum_{i=1}^{20} X_i$.

b) Find $P\left(\sum_{i=1}^{20} X_i = 2\right)$.

c) Give the values of $E\left(\sum_{i=1}^{20} X_i\right)$ and $\mathrm{Var}\left(\sum_{i=1}^{20} X_i\right)$.

3. Three quarters of the members of a cycle club own more than one bike.
A random sample of 10 members is taken from the club.
The random variables $X_i$, $i = 1, 2, 3, \ldots, 10$, are defined as

$X_i = \begin{cases} 1 & \text{if the } i\text{th member owns more than one bike} \\ 0 & \text{if the } i\text{th member does not own more than one bike.} \end{cases}$

a) Write down the distribution for $\sum_{i=1}^{10} X_i$.

b) Find $P\left(\sum_{i=1}^{10} X_i < 7\right)$.

c) Give the values of $E\left(\sum_{i=1}^{10} X_i\right)$ and $\mathrm{Var}\left(\sum_{i=1}^{10} X_i\right)$.

4. A video game machine takes tokens for 1 game and for 3 games.
20% of the tokens used are for 1 game.
a) Find the mean $\mu$ for the number of games per token.

A random sample of 3 tokens is taken when the machine is emptied.
b) List all possible samples.




6.5 Sampling distribution of the mean of repeated 
observations of a random variable 


In Chapter 3 you looked at linear combinations of random variables and had 
the following results: 

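1) $E(aX + b) = aE(X) + b$

2) $\mathrm{Var}(aX + b) = a^2\mathrm{Var}(X)$

3) $E(aX + bY) = aE(X) + bE(Y)$

4) $\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)$ [if X, Y are independent]

If n independent observations are taken of a random variable X which has
mean $\mu$ and variance $\sigma^2$, and $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean, then $\bar{X}$ is a
random variable and we can apply the results above to deduce that

$E(\bar{X}) = \mu$; $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$

$\sum_{i=1}^{n} X_i = X_1 + X_2 + \ldots + X_n$, so applying result 3 above gives

$E\left(\sum_{i=1}^{n} X_i\right) = E(X_1 + X_2 + \ldots + X_n) = \mu + \mu + \ldots + \mu = n\mu$

And applying result 4 above gives

$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \mathrm{Var}(X_1 + X_2 + \ldots + X_n) = \sigma^2 + \sigma^2 + \ldots + \sigma^2 = n\sigma^2$

Then we can apply results 1 and 2 above to go from the expectation and
variance of the sum to the expectation and variance of the sample mean.

$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}E\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n} \times n\mu = \mu$

$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \times n\sigma^2 = \frac{\sigma^2}{n}$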
i=l 


Example 7

i) Find the expectation and variance of the score showing on a fair die.
ii) Find the probability distribution for the mean score when two fair
dice are thrown.
iii) Find the expectation and variance of the mean score showing on two fair dice.
iv) Show that the expectations in i) and iii) are equal and the variance in
iii) is half of that in i).

i) $E(X) = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = 3.5$

$E(X^2) = \frac{1}{6}(1 + 4 + 9 + 16 + 25 + 36) = \frac{91}{6} \Rightarrow \mathrm{Var}(X) = \frac{91}{6} - 3.5^2 = \frac{35}{12}$

ii) From the start of Section 6.4 the distribution is:

$\bar{x}$ | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 | 4 | 4.5 | 5 | 5.5 | 6
$P(\bar{X} = \bar{x})$ | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36

iii) $E(\bar{X}) = \frac{1}{36}(1 + 3 + 6 + 10 + 15 + 21 + 20 + 18 + 15 + 11 + 6) = 3.5$

$E(\bar{X}^2) = \frac{1}{36}(1 + 4.5 + 12 + 25 + 45 + 73.5 + 80 + 81 + 75 + 60.5 + 36) = \frac{329}{24}$

$\Rightarrow \mathrm{Var}(\bar{X}) = \frac{329}{24} - 3.5^2 = \frac{35}{24}$

iv) The mean is 3.5 in both cases and $\frac{35}{24} = \frac{35}{12} \div 2$, as required.
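
The values in Example 7 can be confirmed with exact fractions by treating the 6 scores on one die, and the 36 averages for two dice, as equally likely outcomes. The helper name mean_and_variance below is chosen only for this sketch.

```python
from fractions import Fraction
from itertools import product


def mean_and_variance(values):
    """Exact mean and variance of a list of equally likely values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean**2
    return mean, variance


one_die = [Fraction(s) for s in range(1, 7)]
two_dice_mean = [Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2)]

print(mean_and_variance(one_die))        # mean 7/2, variance 35/12
print(mean_and_variance(two_dice_mean))  # mean 7/2, variance 35/24
```

The expectation is unchanged and the variance is divided by the sample size, 2, in line with the results of this section.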


Example 8 
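A random variable X has probability distribution given by

x | 1 | 2 | 3 | 4
p | 5k | 2k | k | 2k

i) Show that k = 0.1 and calculate E(X) and Var(X).

ii) If $\bar{X}$ is the mean of a randomly selected sample of 5 observations of X,
write down the expectation and variance of $\bar{X}$.

i) $5k + 2k + k + 2k = 10k = 1 \Rightarrow k = 0.1$

x | 1 | 2 | 3 | 4
p | 0.5 | 0.2 | 0.1 | 0.2

$E(X) = 1 \times 0.5 + 2 \times 0.2 + 3 \times 0.1 + 4 \times 0.2 = 2$
$E(X^2) = 1 \times 0.5 + 4 \times 0.2 + 9 \times 0.1 + 16 \times 0.2 = 5.4 \Rightarrow \mathrm{Var}(X) = 5.4 - 2^2 = 1.4$

ii) $E(\bar{X}) = E(X) = 2$, $\mathrm{Var}(\bar{X}) = \frac{1}{5}\mathrm{Var}(X) = \frac{1.4}{5} = 0.28$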


Exercise 6.5 


1. For the following random variables, work out the mean and variance of 
the mean of a sample of 10 independent observations of the random variable. 


a) E(X) =5; Var(X) =4. 
b) E(X) = 26.3; Var(X) = 




c) Z is a random variable with probability distribution given by


d) X is a continuous random variable with pdf given by

$f(x) = \frac{1}{36}(9 - x^2)$ for $-3 < x < 3$.

2. X is a random variable with mean 42.5 and variance 23.1.


a) Find the variance of the mean of a sample of 15 independent 
observations of X. 


b) What is the minimum size of sample needed in order that the 
variance of the sample mean is less than 1? 
3. A fair spinner is equally likely to land on any of four sections. 
The sections score 1, 4, 6 and 9 respectively. 
a) Find the mean and variance of the score obtained on a single spin. 


A random sample of 12 spins is taken and $\bar{X}_{12}$ denotes the mean of the
12 scores obtained.
b) Find the mean and variance of $\bar{X}_{12}$.


6.6 Sampling distribution of the mean of a sample from a 
normal distribution 


In Section 4.2, you saw that the sum of two normal random variables was still a
normal random variable. It follows that the distribution of $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ when
$X \sim N(\mu, \sigma^2)$ will be $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$; that is, still normal, with the same mean
as the underlying population, but the variance will be reduced by a factor of $\frac{1}{n}$.


Example 9 

Packets of cereal are labelled as containing 500 grams of cereal. 

The actual contents can be modelled by a normal distribution with 
mean 503g and standard deviation 7 g. 


What is the probability that the mean contents of a randomly selected 
sample of 10 packets will be under 498 g? 


$X \sim N(503, 7^2) \Rightarrow \bar{X} \sim N\left(503, \frac{7^2}{10}\right)$
The mean stays at 503, but the variance is reduced by a factor of 10 because the sample size is 10.

$P(\bar{X} < 498) = \Phi\left(\dfrac{498 - 503}{7/\sqrt{10}}\right) = \Phi(-2.259)$
This is just the standard normal distribution probability calculation from S1 Chapter 9.

$= 1 - \Phi(2.259) = 1 - 0.9881 = 0.0119 = 1.2\%$




Example 10 


The volume of liquid dispensed into bottles is normally distributed with 
mean 500ml and standard deviation 6.2 ml. 
i) A random sample of 5 bottles is taken. What is the probability that

the mean volume in the 5 bottles is at least 504 ml? 
ii) Another random sample is to be taken. What is the minimum sample 

size needed for there to be no more than a 1% probability that the 

mean volume is greater than 501 ml? 

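i) $X \sim N(500, 6.2^2) \Rightarrow \bar{X} \sim N\left(500, \frac{6.2^2}{5}\right)$
The mean stays at 500, but the variance is reduced by a factor of 5 because the sample size is 5.

$P(\bar{X} > 504) = 1 - \Phi\left(\dfrac{504 - 500}{6.2/\sqrt{5}}\right) = 1 - \Phi(1.443)$

$= 1 - 0.9255 = 0.0745 = 7.5\%$

ii) $X \sim N(500, 6.2^2) \Rightarrow \bar{X}_n \sim N\left(500, \frac{6.2^2}{n}\right)$

$P(\bar{X}_n > 501) \le 0.01 \Rightarrow \Phi\left(\dfrac{501 - 500}{6.2/\sqrt{n}}\right) \ge 0.99$

$\Rightarrow \dfrac{501 - 500}{6.2/\sqrt{n}} \ge 2.326$
2.326 is the z-score cutting off the top 1% of a normal distribution. Don't forget that the standard
error has the factor of $\sqrt{n}$, and to find $n$ you need to square and then take the integer above the solution.

$\Rightarrow \sqrt{n} \ge 2.326 \times 6.2 = 14.42\ldots \Rightarrow n \ge 207.97\ldots$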


So a sample of size 208 is needed for there to be no more than a 
1% probability that the mean volume is greater than 501 ml. 
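
The calculations in Examples 9 and 10 can be reproduced with the NormalDist class from Python's standard statistics module, as sketched below. The variable names are illustrative, and for part ii) of Example 10 the sketch deliberately uses the rounded table value z = 2.326, as the worked solution does, rather than a more precise z-score.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Example 9: mean of 10 packets, X ~ N(503, 7^2).
p_example_9 = NormalDist(503, 7 / sqrt(10)).cdf(498)

# Example 10 i): mean of 5 bottles, X ~ N(500, 6.2^2).
p_example_10 = 1 - NormalDist(500, 6.2 / sqrt(5)).cdf(504)

# Example 10 ii): smallest n with (501 - 500) / (6.2 / sqrt(n)) >= z.
z = 2.326  # table value cutting off the top 1%, as used in the text
n_min = ceil((z * 6.2 / (501 - 500)) ** 2)

print(round(p_example_9, 4), round(p_example_10, 4), n_min)
```

NormalDist takes the standard deviation, not the variance, so the standard error is passed in as the standard deviation divided by the square root of the sample size.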


Exercise 6.6 


1. The contents of bottles of water are normally distributed with mean 
600 millilitres and standard deviation 7.2 millilitres. 
a) Give the distribution of the mean content of a random sample of 6 bottles. 
b) Find the probability that the mean content of a random sample of 
6 bottles is less than 597 millilitres. 


2. Packets of biscuits are labelled as containing 350 grams. The actual 
contents can be modelled by a normal distribution with mean 
352 grams and standard deviation 4.5 grams. 
a) Find the probability that the mean contents of a random sample of
10 packets is at least 350 grams. 


b) What is the smallest size of random sample for which there is 
probability of under 1% that the mean contents is under 350 grams? 


3. $X \sim N(\mu, \sigma^2)$. $\bar{X}_n$ is the mean of a random sample of n independent
observations of X, and $P(|\bar{X}_n - \mu| > 0.25\sigma) < 0.1$.
a) Find the smallest possible value of n.


b) For this value of n find $P(|\bar{X}_n - \mu| < 0.1\sigma)$.


4. The lifetime of Sooperstrong batteries is normally distributed with 

mean 85 hours and standard deviation 9.2 hours. 

a) Find the probability that the mean lifetime of a random sample of 
25 Sooperstrong batteries is less than 83 hours. 

The lifetime of Powersure batteries is normally distributed with

mean 83 hours and standard deviation 2.1 hours. 

b) Find the probability that the mean lifetime of a random sample of 
25 Sooperstrong batteries is shorter than the mean lifetime of a 
random sample of 5 Powersure batteries. 


5. The number of characters on a page of a book can be modelled by a 
normal distribution with mean 1983 and standard deviation 44.7. 
Find the probability that the average number of characters per page in 
a random sample of 8 pages is more than 2000. 


6.7 The Central Limit Theorem 


In Section 6.5 you saw that the expectation and variance of the mean of a 
sample of independent observations of a random variable X are known 
exactly from the expectation and variance of the random variable X. In 
Section 4.1 you saw that the distribution of the sum of two independent 
Poisson random variables is still a Poisson, and in Section 4.2 that the sum 
of two normal random variables is another normal, but in general the 
distribution of the sum of repeated observations of a random variable is not 
the same distribution as the original population. You have seen examples 
of this with the total score on two dice: for one die the distribution is 
uniform — all observations are equally likely - but for two dice the 
distribution is triangular shaped — starting at 2 it increases to a peak at 

the middle value of 7 and then decreases down to 12. 


This means that, while you know what the expectation and variance will be 
of the sample mean generally, you cannot make any probabilistic statements 
in the way that you could in Examples 9 and 10. 




The Central Limit Theorem (CLT) says that the distribution of sample means 
becomes approximately normal as the sample size increases, no matter 

what the underlying population is, and at this level an arbitrary cutoff of 

n = 30 is applied as a standard rule of thumb as to what sample size is 
needed in order to use the CLT. 


It is worth exploring this a little further however, to get a firmer 
understanding of the theorem. For a uniform distribution, or for a 
symmetric binomial distribution (and for almost any reasonably symmetric 
population) the distribution of sample means is approximately normal 
with much smaller sample sizes than 30. 


There are two important facts which get lost sight of very often: 


• The CLT says nothing about the standard error of the mean. That
result is derived from algebraic manipulations of the definitions of
mean and variance; the CLT is solely about the probability
distribution of the sample mean.


• For any underlying population, increasing the sample size will
make the sampling distribution of the mean look more like the
normal. No matter how strange the underlying distribution is, by
the time the sample size gets to 30 the distribution will be passably
close to the normal distribution.


Let’s look at some instances: 


[Figure: histogram, "Distribution of sample means", for samples of size 8 from a uniform distribution. Horizontal axis from 0 to 100. Samples = 1199, Mean = 50.0, St. dev. = 10.2]


This is a screenshot of a simulation after almost 1200 samples of size 8 had 
been taken from a uniform distribution. The histogram of the sample mean 
values shows the characteristics of the normal bell-shaped curve. 

What about a non-symmetric distribution? The graph below shows the pdf
of the exponential distribution with a mean of 20.


[Figure: graph of the pdf f(x) of the exponential distribution with mean 20; vertical axis from 0 to 0.05]


And the screenshot below shows a simulation after almost 1300 samples 
of size 8 had been taken from this exponential distribution. 


[Figure: histogram titled 'Distribution of sample means', samples of size 8 from the exponential distribution; horizontal axis 0 to 100. Samples = 1299, Mean = 20.0, St. dev. = 6.9]


For a sample size of only 8, the distribution has moved a long way towards 
the normal but not far enough yet that you would say that the normal is a 
good approximation - it still has a noticeable positive skew. 


By the time the sample size is 20, the skewness has largely disappeared 
from the sampling distribution, and using a normal distribution would 
seem reasonable. 

[Figure: histogram titled 'Distribution of sample means', samples of size 20 from the exponential distribution; horizontal axis 0 to 100. Samples = 1300, Mean = 19.9, St. dev. = 4.2]
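
The simulations described above can be imitated with a few lines of code. The sketch below is not the program that produced the screenshots; it assumes the uniform population runs from 0 to 100 (consistent with a mean of 50), draws 5000 samples in each case, and uses a fixed seed so that a run can be repeated. The helper names are made up for illustration. It records the mean, standard deviation and skewness of the simulated sample means, so you can see the skewness shrink as the sample size grows.

```python
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Return a list of sample means, one per simulated sample."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


def skewness(values):
    """Moment-based skewness: close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def uniform(r):
    return r.uniform(0, 100)        # assumed range; population mean 50


def exponential(r):
    return r.expovariate(1 / 20)    # population mean 20


rng = random.Random(1)              # fixed seed so the run is repeatable
results = {}
for name, draw in [("uniform", uniform), ("exponential", exponential)]:
    for n in (8, 20):
        means = sample_means(draw, n, 5000, rng)
        results[(name, n)] = (
            statistics.fmean(means),
            statistics.stdev(means),
            skewness(means),
        )
        print(name, n, [round(v, 2) for v in results[(name, n)]])
```

For the exponential population the skewness of the sample means should still be clearly positive at n = 8 and noticeably smaller at n = 20, in line with the histograms.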




Example 11 


On a flight there are 43 bags checked in. Historical records show that on 
that route the mean weight of checked bags is 12.1 kg with a standard 
deviation of 3.8 kg. 


What is the probability that the mean weight of the checked bags on this
flight will be under 11 kg? (You may treat the bags on this flight as though
they are a random sample of bags checked in on that route.)


Random sample of size 43 means $\bar{X} \sim N\left(12.1, \frac{3.8^2}{43}\right)$ approximately.
Even though you know nothing about the distribution of the weights of
bags themselves, the Central Limit Theorem allows you to use this approximation.


$P(\bar{X} < 11) = \Phi\left(\frac{11 - 12.1}{3.8/\sqrt{43}}\right) = \Phi(-1.898)$

$= 1 - \Phi(1.898) = 1 - 0.9712 = 0.0288 = 2.9\%$

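
The calculation in Example 11 can be checked numerically. This is only a sketch using Python's standard library in place of normal tables; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.1, 3.8, 43            # values given in Example 11
standard_error = sigma / sqrt(n)        # standard deviation of the sample mean

# CLT approximation: the sample mean is treated as N(mu, sigma^2 / n)
sampling_dist = NormalDist(mu, standard_error)

z = (11 - mu) / standard_error          # standardised value
p_under_11 = sampling_dist.cdf(11)      # P(sample mean < 11)

print(round(z, 3), round(p_under_11, 4))
```

The result agrees with the table-based answer of about 2.9%.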


Exercise 6.7 


1. The times taken in the cycling phase of a triathlon have mean 
43.2 minutes and standard deviation 4.3 minutes. Find the probability 
that the mean cycling time taken by a random sample of 50 competitors 
is over 44 minutes. 


2. The mean score in the 100 metres discipline of the decathlon is 912 points 
and the standard deviation is 42 points. Find the probability that the 
mean score of 45 randomly chosen decathletes in the 100 metres is 
over 900 points. 


3. The times for a particular journey during rush hour in a large city have 
mean 23.3 minutes and standard deviation 8.9 minutes. 


a) Find the probability that the mean time taken for a random sample 
of 60 of the times for this journey is under 25 minutes. 

b) If you wanted to know the probability that the mean time taken for a 
random sample of 10 of the times for this journey was under 25 minutes, 
explain why you would not be able to give a reliable estimate. 


4. $X \sim \mathrm{Po}(7)$. A random sample of 72 observations of X is taken.


i) State the approximate distribution of the sample mean, $\bar{X}$.


ii) Find the probability that $\bar{X}$ is greater than 6.5.


5. $X \sim B(10, 0.3)$. A random sample of 80 observations of X is taken.

i) State the approximate distribution of the sample mean, $\bar{X}$.

ii) Find the probability that $\bar{X}$ is greater than 3.25.


6.8 Descriptions of some sampling methods 


Systematic (or purposive, or periodic) sampling: Choose a number at
random from 1 to n, and then take every nth value thereafter. This means
that all units have an equal chance of being selected, but it does not
constitute a random sample, as not all combinations are equally likely.
There is a particular danger with this method, especially for ‘production 
line sampling’ as any faults with a particular machine in the system will 
affect items regularly, and the systematic method of choosing the items 
to be sampled means that you are very likely either to miss them all or hit 
them all (or half or a third of them). 
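
As an illustration of the danger described above, here is a minimal sketch of systematic sampling. The function name and the production line of 100 numbered items are invented for the example, and the fault is assumed to recur on every 10th item.

```python
import random


def systematic_sample(items, step, rng):
    """Pick a random start among the first `step` items, then take every `step`-th item."""
    start = rng.randrange(step)          # 0-based position of the first item taken
    return items[start::step]


rng = random.Random(3)                   # fixed seed so the run is repeatable
production_line = list(range(1, 101))    # hypothetical items numbered 1 to 100
sample = systematic_sample(production_line, 10, rng)
print(sample)

# Suppose one machine spoils items 5, 15, 25, ..., 95.
# Because the sample uses the same spacing, it hits all of them or none of them.
faulty = set(range(5, 101, 10))
hits = sum(1 for item in sample if item in faulty)
print(hits)
```

Whatever the starting point, the number of faulty items found is either 0 or 10, never a representative fraction.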


Stratified sampling: This is where the population has identifiable groups 
(strata) within it for which we have reason to believe that the behaviour 
will be different. The sample then is made up of the same proportions of 
each group. (It is very effective in reducing the variability of sampling 
estimates when the population has known strata because it removes one 
source of variation, since the numbers out of each group are the same every 
time the sample is carried out. However, you can only use it when the sizes 
of the groups in the population are known.) 
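
A stratified sample with proportional allocation might be drawn along the following lines. This is a sketch only: the two strata and their sizes are hypothetical, and simple rounding is used to decide how many to take from each stratum, which will not always add up exactly to the intended total.

```python
import random


def stratified_sample(strata, total_size, rng):
    """Allocate the sample in proportion to stratum size, then sample at random within each stratum."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may need adjusting for awkward sizes
        k = round(total_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample


rng = random.Random(4)                   # fixed seed so the run is repeatable
strata = {                               # hypothetical population of 200 students
    "year 12": [f"Y12-{i}" for i in range(120)],
    "year 13": [f"Y13-{i}" for i in range(80)],
}
chosen = stratified_sample(strata, 20, rng)
print({name: len(members) for name, members in chosen.items()})
```

The numbers taken from each group are fixed by the group sizes, which is why the group sizes in the population must be known in advance.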


Quota sampling: This is where the sample used is required to have specified
proportions (the quotas) of different groups, for example of men/women,
adults/children, different social classes, etc. In a stratified sample (which
quota sampling resembles in certain ways), a systematic sample or a random
sample, the system determines the individuals to be taken. In a quota sample
the sampler has more freedom, and consequently there is the danger that
subjective judgements will introduce bias in the sample. It is much easier to
accomplish than the ‘proper statistical method’ it mimics, i.e. stratified
sampling, so it is possible to use rather larger samples (a good feature). If
experienced interviewers are used, and care is taken in structuring the method
so as to avoid predictable causes of bias, then it is practically useful:
cheap and cheerful is fine when you only need a rough idea of what is
happening, particularly if you would like to know it quickly.

N.B. Stratified and quota sampling are both ‘representative’ samples in that 
the profile of the sample should broadly match the profile of the population 
in respect of the groupings chosen. 




Cluster sampling: If (and it is a big if) a population naturally subdivides
into a number of small homogeneous groups or clusters then it can be
convenient to select a number of clusters and sample from these clusters
(or even examine every member). Clusters might well be generated on a
geographical basis, which would significantly reduce the costs of collecting
the sample information. Political opinion polls use a variation
on this, where a number of areas over the country are chosen, and a sample
from each area is taken, with a total sample size of around 1000 people
commonly being used.


Self-selection: However bad any of these methods are (even poor quota
samples) they are much better than self-selecting samples, e.g. radio or
television ‘phone-ins’ where listeners record their ‘vote’. Views of those
who feel strongly about something are much more frequently expressed
than those of the apathetic majority, and negative strong feelings
are usually overrepresented even within this group.


Comparisons of methods 


The choice of sampling method and size of sample to be used is essentially 
a pragmatic matter balancing cost, speed of availability of the analysis, 
and the quality of the information which can be expected. 


If stratification is well defined then stratified sampling is the ‘best approach’ 
statistically, and the size of the sample is chosen so that the different strata 
can be represented proportionally. Within strata, the sample should ideally be 
taken at random, so that the question of potential bias is not an issue. With 
other types of sampling you have to ask whether there is any likelihood that 
the sample obtained may not be representative of the population as a whole. 
The reasons for using other types are practical: the extra costs involved in
stratified sampling will often mean that you can afford to take a larger sample
with another method and end up with better quality of information at
less cost overall.


Note that all of the above discussion assumes that you are taking
a small sample in comparison with the size of the population, so that you can
neglect the effect of the sampling proportion (the fraction of the population
that is sampled). Until you are sampling a significant proportion (e.g. 50%)
of the population, the sampling proportion has a much smaller effect on the
quality of the estimates than the sample size does.

Summary exercise 6


1. Karolina wants to choose one student at
random from Annika, Belinda and Charlize.
She looks at the sports results of Monday’s
newspaper. If the first match is a home win
she will choose Annika. If it is a draw she
will choose Belinda and if it is an away win
then she will choose Charlize.

Explain why this is not a fair method of 
choosing the student. 


2. a) Explain what you understand by the
term random sample. 


b) Give an example of bias in sampling. 


3. A manufacturer of rechargeable batteries
wants to know how long they will operate for
before needing to be charged.


Explain why a sample should be used rather 
than a census. 


4. Using the random number generator on your
calculator, take a simple random sample of 
size 10 from the table below, and calculate 
the mean of your sample. 


79 45 79 56 76 62 64 56 44 48 75 40 38 67 37 


50 77 55 35 39 61 73 39 70 36 70 41 54 35 72 


48 62 42 39 35 36 77 62 55 54 48 44 54 74 53 


63 80 80 72 63 80 80 58 63 41 61 48 39 70 63 


63 65 41 44 75 40 78 67 58 70 55 71 52 42 69 


5. A bag contains two 50 cent coins, a 20 cent
coin and a 10 cent coin. Two coins are to be
randomly chosen.

a) If the coins are marked as A, B, C and D,
list the 12 possible outcomes of taking
one coin and then a second coin.

b) Hence find the sampling distribution of
the mean value of the coins taken.


6. X is a random variable with mean 22.4 and
variance 7.9. 


a) Find the variance of the mean of a sample 
of 20 independent observations of X. 


b) What is the minimum size of sample 
needed in order that the variance of the 
sample mean is less than 1? 


7. The contents of bottles of fruit juice are
normally distributed with mean
505 millilitres and standard deviation
6.1 millilitres.


a) Give the distribution of the mean content 
of a random sample of 6 bottles. 


b) Find the probability that the mean 
content of a random sample of 6 bottles 
is less than 500 millilitres. 


8. Packets of sweets are labelled as containing
250 grams. The actual contents can be 
modelled by a normal distribution with 
mean 252 grams and standard deviation 
3.5 grams. 


a) Find the probability that the mean 
contents of a random sample of 10 
packets is at least 250 grams. 


b) What is the smallest size of random 
sample for which there is probability of 
under 0.5% that the mean contents is 
under 250 grams? 


EXAM-STYLE QUESTIONS 


9. A sample of size n is taken from a large population which has a mean of $\mu$
and variance $\sigma^2$.

a) Write down the mean and variance of the mean of this sample.

b) Under what conditions can the distribution of the sample mean be
treated as a normal distribution?


10. For the following situations, give a reason why the suggested sample is not appropriate:

a) To investigate waiting times for a train, a researcher goes to a station and asks the
first 50 people she meets on the platform how long they have been waiting.

b) To investigate if an advertising campaign has improved attendance for hospital
appointments, the attendance at the first clinic on Monday is considered.

c) A production line is making batteries. Every 20th battery is tested in quality
control sampling.


11. Give a reason why sampling would be needed in order to reach a conclusion about:

a) the mean height of all 16-year-old males in South Africa

b) the mean usable life of a certain type of battery.


Chapter summary

• You can use a sample to gather information about a population when a census would be
impractical.
• Samples need to be constructed carefully to avoid bias.
• A sampling frame is a list of all of the sampling units or elements (often people) in the
population to be sampled.
• A statistic is a function that depends only on the observed values in a sample, i.e. it must not
use any of the unknown parameter values.
• The probability distribution of a statistic is known as the sampling distribution of the statistic.
• Random numbers can be used to draw a random sample from a population.
• The Central Limit Theorem says that the distribution of sample means becomes
approximately normal as the sample size increases, no matter what the underlying
population is; n = 30 is applied as a standard rule of thumb as to what sample size is needed
in order to use the Central Limit Theorem.

Maths in real-life


Modelling statistics 


The Golden Gate Bridge is one of the world’s 
iconic bridges as it spans 1300 metres and was 
the longest suspension bridge in the world from 
when it opened in 1937 until 1964. When it 
opened, during the first fiscal year it averaged 
just over 9000 vehicles a day, which rose steadily 
over the next 50 years to peak at nearly 120000 
vehicles a day. Since then it has reduced slightly 
but still averages over 100000 vehicles a day. The 
bridge has a moveable median barrier which is 
moved several times during the day to optimise 
traffic flows. On weekday mornings more traffic 
is heading south into San Francisco and four lanes 
are open south and two north. In the afternoons 
it is four lanes north and two south while at other 
times on weekdays and at weekends there are 
three lanes open in each direction. The bridge 
operators have now introduced congestion price 
charging in order to try to reduce the traffic loads 
at peak traffic times. 


The statistical challenges for building and
maintaining bridges are considerable - the 
structural specifications given to the architects 
and engineers in building bridges are based on 
projected uses in the future. The Golden Gate 
Bridge celebrated its 80th anniversary in 2017 
and for about 45 years it has carried more than 
ten times the number of vehicles it carried when 
it opened. It is remarkable that the structure was 
designed to be strong enough to cope with such 
a large increase in usage over its lifetime. 


[Figure: line graph titled 'Traffic on the Golden Gate Bridge', plotted by year (July-June)]

People are creatures of habit, so if traffic layouts are to be changed on the 
bridge at different times of the day, there is considerable virtue in keeping 
the structure as simple as possible. But the operators have found that, while 
the morning commute is fairly predictable, the afternoon northbound 
commute is much more variable. There has also been an increase in 
southbound traffic in the afternoons, meaning that fixed times for the 
lane changes were resulting in considerable delays for southbound traffic. 
The current model is to allocate traffic lanes to accommodate northbound 
traffic without introducing unacceptable delays southbound. 

Tolls collected on the Golden Gate Bridge now exceed $100 million
annually and about half of the tolls collected are used to subsidise other
methods of transport (the Golden Gate Ferry and Golden Gate Transit)
which are estimated to reduce the commute traffic by around 25%.
Decisions about pricing tolls, and fares for other forms of transport, are
complex, involving sophisticated economic and statistical modelling,
as the Golden Gate Bridge tries to maintain its iconic status in its
fourth quarter century of operations.

Estimation


A haulage firm has to quote prices for jobs in 
which one of their major costs is the length of 
time it will take. They have to quote in advance, 
sometimes quite a long time in advance, but 
don't know how long the journey will take until 
it happens. Having some idea of what constitutes 
a typical journey (using a range of values
rather than just a single point estimate) helps
the company make sensible economic pricing 
decisions. 


Objectives 

After studying this chapter you should be able to: 

• Calculate unbiased estimates of the population mean and variance from a sample, using
either raw or summarised data (only a simple understanding of the term ‘unbiased’ is
required).


• Determine a confidence interval for a population mean in cases where the population is
normally distributed with known variance or where a large sample is used.


• Determine, from a large sample, an approximate confidence interval for a population
proportion.


Before you start 
You should know how to: Skills check: 


1. Calculate the mean and variance of a set of 1. Calculate the mean and variance of 
data, e.g. 55689 1011 11 12 14 17 26 
Calculate the mean and variance of 
3.56'6:'8 9 02) 13.15) 15 16: 


$n = 11$, $\sum x = 108$, $\sum x^2 = 1270$


$\bar{x} = \frac{108}{11} = 9.82$; $\mathrm{Var}(x) = \frac{\sum x^2}{n} - \bar{x}^2 = 19.1$


7.1 Interval estimation 


If you want to estimate the mean ($\mu$) of a population by taking a sample,
the best you can do is to use the sample mean $\bar{X}$. We will define more closely
what we mean by ‘best’ in the next section.


If you look at the following three samples from different populations (each of 
15 values), the sample means are each 44.82 and you would use that value as 
the point estimate of each of the population means. However, would you feel 
that you have the same quality of information in the three cases? 


[Figure: dot plots of the three samples (top, middle, bottom) on a common x-axis running from 30 to 65]

Top | Middle | Bottom
43.05 | 39.08 | 30.99
44.73 | 40.39 | 33.17
45.58 | 43.88 | 35.77
44.89 | 46.15 | 39.33
45.13 | 44.43 | 42.37
45.69 | 38.00 | 42.77
47.48 | 47.08 | 43.44
44.67 | 42.17 | 43.86
43.72 | 48.91 | 44.80
44.59 | 43.97 | 47.30
46.44 | 44.90 | 48.48
42.64 | 46.99 | 49.15
43.11 | 43.69 | 52.81
45.55 | 54.31 | 54.43
44.99 | 48.37 | 63.62
Mean: 44.82 | 44.82 | 44.82

The values in the top sample are very close together whereas the ones in the
bottom sample have much more variability. Instead of just giving a point 
estimate, which loses all the extra information we can see from the spread 
of the sample values, we are going to develop interval estimates which 
capture something about how reliable the estimate is. 


We will get to the formal construction of the estimates in later sections
of this chapter, but informally it seems reasonable that the interval estimate
for the top sample would have the smallest width, and the estimate based on
the bottom sample would have the largest width.


One more thing to consider when we think of how reliable our information 
is: suppose we only had a few values rather than the 15 in each of the 
samples above. 


The set of 15 listed below is the middle set of data shown above and the small
data set of 5 shows the same amount of variability as it has. Again, informally, 
the extra information from the larger sample should make you feel the estimate 
is more reliable for the larger sample - and the interval estimate for the larger 
sample will be narrower to reflect the better quality of information. 


Larger sample (15 values): 39.08, 40.39, 43.88, 46.15, 44.43, 38.00, 47.08, 42.17,
48.91, 43.97, 44.90, 46.99, 43.69, 54.31, 48.37

Smaller sample (5 values): 38.00, 42.50, 46.90, 49.19, 47.50

[Figure: dot plots of the two samples on a common axis running from 36 to 54]

Before you learn how to construct confidence intervals formally 
(this is what the interval estimates are called), you need to meet some 
other important ideas in estimation. 


There is no exercise for this section.


7.2 Unbiased estimate of the population mean


In Chapter 3, you saw that:

\[ E(\bar{X}) = E\left(\frac{1}{n}\sum X_i\right) = \frac{1}{n}\sum E(X_i) = \mu \]

This says that ‘on average’ the sample mean will give the true value of the
population mean. This is the definition of an unbiased estimator of any parameter.
But there are many ‘unbiased estimates’ of the mean: for example $X_1$, $X_2$ and $\frac{X_1 + X_2}{2}$
are all also unbiased estimates of the population mean. So what else would we
want in an estimator? If it gave estimates close to the true value more often
it would be doing a better job as an estimator than one where the estimates
were more widely spread. Consider the variances of the estimators listed here:

\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum X_i\right) = \frac{\sigma^2}{n} \]

\[ \mathrm{Var}(X_1) = \sigma^2 \]

\[ \mathrm{Var}(X_2) = \sigma^2 \]

\[ \mathrm{Var}\left(\frac{X_1 + X_2}{2}\right) = \frac{\sigma^2}{2} \]


So $\bar{X}$ is the unbiased estimator out of this group with the lowest variance.
In fact, it is the estimator with the lowest variance out of all unbiased
estimators, but the proof of that is beyond this course. What does it mean
in practice? The table below shows the results of a simulation taking
observations from a normal distribution with mean 50 and standard
deviation 5.

X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X-bar | X1 | X2 | (X1 + X2)/2
53.2 | 54.0 | 54.9 | 60.2 | 51.3 | 50.8 | 46.4 | 47.4 | 52.3 | 53.2 | 54.0 | 53.6
52.6 | 59.7 | 44.6 | 46.9 | 53.2 | 48.4 | 45.9 | 47.7 | 49.9 | 52.6 | 59.7 | 56.1
51.1 | 49.7 | 44.6 | 53.5 | 51.5 | 55.9 | 57.2 | 50.3 | 51.7 | 51.1 | 49.7 | 50.4
56.5 | 55.0 | 49.5 | 56.1 | 55.0 | 43.5 | 50.8 | 51.7 | 52.3 | 56.5 | 55.0 | 55.8
51.3 | 51.5 | 55.4 | 48.1 | 50.5 | 41.3 | 49.6 | 55.8 | 50.5 | 51.3 | 51.5 | 51.4
41.4 | 47.8 | 49.1 | 50.4 | 55.7 | 45.4 | 51.9 | 50.7 | 49.1 | 41.4 | 47.8 | 44.6
55.0 | 49.8 | 39.2 | 53.9 | 47.4 | 52.3 | 54.5 | 56.2 | 51.1 | 55.0 | 49.8 | 52.4
47.2 | 45.4 | 58.4 | 52.4 | 51.3 | 53.7 | 49.4 | 46.7 | 50.5 | 47.2 | 45.4 | 46.3
46.5 | 48.6 | 54.6 | 46.8 | 46.4 | 44.4 | 55.6 | 46.2 | 48.6 | 46.5 | 48.6 | 47.6
44.4 | 42.8 | 52.0 | 46.6 | 58.4 | 47.0 | 53.7 | 61.8 | 50.8 | 44.4 | 42.8 | 43.6
40.4 | 53.9 | 45.3 | 52.9 | 45.6 | 50.6 | 50.8 | 49.4 | 48.6 | 40.4 | 53.9 | 47.2
57.3 | 51.2 | 57.2 | 53.4 | 53.4 | 47.0 | 48.0 | 49.7 | 52.2 | 57.3 | 51.2 | 54.3
56.5 | 47.6 | 51.8 | 45.9 | 46.0 | 53.5 | 51.7 | 46.1 | 49.9 | 56.5 | 47.6 | 52.1
50.2 | 41.2 | 51.0 | 46.4 | 54.1 | 51.0 | 44.6 | 51.8 | 48.8 | 50.2 | 41.2 | 45.7
42.8 | 44.8 | 58.5 | 45.9 | 53.3 | 59.0 | 49.2 | 41.8 | 49.4 | 42.8 | 44.8 | 43.8

From the four columns on the right of the table you can see that $\bar{X}$ returns
estimates for the mean which consistently stay close to the true value of
50. $\frac{X_1 + X_2}{2}$ is ‘better’ than just the single values but $\bar{X}$ is much better.
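
The comparison of the three estimators can be repeated on a much larger scale than the 15 rows of the table. The sketch below is an assumption-laden imitation of that simulation, not the spreadsheet used for the book: it takes 20 000 samples of 8 observations from a normal distribution with mean 50 and standard deviation 5, with a fixed seed so the run is repeatable.

```python
import random
import statistics

rng = random.Random(5)                      # fixed seed so the run is repeatable
mu, sigma, n, repeats = 50, 5, 8, 20000

xbar, first, pair = [], [], []
for _ in range(repeats):
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar.append(statistics.fmean(xs))       # mean of all 8 observations
    first.append(xs[0])                     # X1 on its own
    pair.append((xs[0] + xs[1]) / 2)        # (X1 + X2) / 2

# Each estimator averages out close to 50, but the spreads differ:
# theory gives variances of 25/8, 25 and 25/2 respectively.
for name, estimates in [("X-bar", xbar), ("X1", first), ("(X1+X2)/2", pair)]:
    print(name, round(statistics.fmean(estimates), 2),
          round(statistics.variance(estimates), 2))
```

All three estimators are unbiased, so their averages agree; it is the variance that separates them.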


Although it is not part of the requirements for this course it is worth
mentioning another property of $\bar{X}$ as an estimator for the population
mean. It is consistent, which means that as the sample size increases,
the probability that the estimator is close to the true value also increases.
Consistency is a property that intuitively feels desirable in an estimator:
if you were to take a huge sample (of several million observations say), you
would want your estimate to be really close to the true value. That is all that
consistency means.


Exercise 7.2 

1. The table below shows a sample of 30 observations taken from a
normal distribution with mean $\mu$ and standard deviation 10.
Calculate the following unbiased estimates of $\mu$:
a) the mean of all 30 values 
b) the mean of each block of 5 values (will give 6 separate estimates) 


c) the mean of each row of 10 values (will give 3 separate estimates). 


82.1 51.9 59.9 66.2 71.5 69.0 59.1 50.5 73.2 64.4 
54.2 81.6 57.3 53.1 57.5 63.9 62.3 78.5 74.7 58.7 
57.6 68.9 61.0 67.4 45.8 53.0 70.2 48.6 81.5 46.5 


2. Repeat question 1 with the set of data shown below. 
75.6 59.4 48.5 65.3 73.7 61.8 65.7 64.1 64.3 60.1 
50.0 78.7 51.5 61.2 54.9 62.3 62.3 56.1 60.9 56.9 
67.5 57.4 60.3 52.9 59.9 76.7 56.2 71.0 48.8 49.5 


3. On the website you will find a spreadsheet file named Exercise 7_2 Q3. 
It was used to generate the data in the two questions above. 


The mean, $\mu$, of the population being sampled is actually 60 and it is
important to realise that the best estimator will not always give the
value closest to 60, especially when you have multiple bites at the
cherry, for example in taking 6 estimates each based on 5 observations.


The spreadsheet is set up to show you the comparable estimates from 
parts a)-c) of questions 1 and 2 - without you doing the tedious work 
of calculating all the averages. Using the spreadsheet to generate a 
number of samples, explore how variable the estimators in parts a)
(based on 30 values), b) (based on 5 values) and c) (based on 10 values) are.



7.3 Unbiased estimate of the population variance 


We know that the variance of a random variable can be calculated as
$E\{(X - \mu)^2\}$ and that this is equivalent to $E(X^2) - \mu^2$.
If somehow we knew what $\mu$ was before we took a sample we would be
able to use it in a calculation to estimate the variance of the population.
However, that situation is extremely rare and you normally have to use
$\bar{X}$ instead of $\mu$ in any estimation process. Algebraically, it is possible to
prove that the value of $\sum (x_i - k)^2$ is a minimum when $k = \bar{x} = \frac{1}{n}\sum x_i$,
but we also know that $\bar{x}$ is not always equal to the true value $\mu$.


This means that $\frac{1}{n}\sum (x_i - \bar{x})^2$, which is the variance of the sample, will tend
to underestimate the variance because it is using the centre of the data ($\bar{x}$)
rather than the population mean $\mu$.


However, it can be shown that

\[ s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right) = \frac{1}{n-1}\left(\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right) \]

is an unbiased estimator of the population variance. The first form of this
expression is the way it is defined, and the second and third forms are
computational equivalents which are often easier to use.


The proof is not part of this course, but it is not too difficult and is
included at the end of this section.


For a set of data grouped into intervals you saw in S1 Section 2.3 that the
variance was given by $\frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}$ or $\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2$. If the data are a sample
rather than the full population then you need to adjust as above and use

\[ s^2 = \frac{1}{n-1}\sum f_i (x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}\right) \quad \text{where } n = \sum f_i \]


Most calculators will give two standard deviations, normally labelled as
$\sigma_n$ and $\sigma_{n-1}$, or as $\sigma$ and $s$. The first is used if you have a full
set of data, i.e. it is a population, and the second should be used if you
have a sample from a larger population.

Example 1

The number of passengers travelling on an executive jet on a random
sample of 10 journeys is given below.

5, 2, 8, 3, 3, 5, 1, 3, 0, 4

Find unbiased estimates of the mean and variance of the number of
passengers on a journey in the executive jet.


$\sum x = 34$; $\sum x^2 = 162$, so $\bar{x} = \frac{34}{10} = 3.4$; $s^2 = \frac{1}{9}\left(162 - \frac{34^2}{10}\right) = 5.16$


are the unbiased estimates of the mean and variance.
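
The arithmetic in Example 1 can be checked with a short script. This sketch takes the ten observations to be 5, 2, 8, 3, 3, 5, 1, 3, 0, 4, which reproduce the totals 34 and 162 quoted in the solution; the variable names are illustrative.

```python
import statistics

passengers = [5, 2, 8, 3, 3, 5, 1, 3, 0, 4]    # the ten journeys in Example 1

n = len(passengers)
sum_x = sum(passengers)
sum_x2 = sum(x * x for x in passengers)

mean = sum_x / n                                   # unbiased estimate of the mean
s_squared = (sum_x2 - sum_x ** 2 / n) / (n - 1)    # divisor n - 1, not n

print(sum_x, sum_x2, mean, round(s_squared, 2))

# The library function statistics.variance also uses the n - 1 divisor,
# whereas statistics.pvariance divides by n (the variance of the sample itself).
print(round(statistics.variance(passengers), 2),
      round(statistics.pvariance(passengers), 2))
```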


Example 2 
The weights in kg (to the nearest kg) of the hand luggage of a random 
sample of 10 passengers travelling on a long haul flight are given below. 
8, 7, 9, 10, 10, 8, 4, 7, 9, 10 

Find estimates of the mean and variance of the weight of hand luggage 
for passengers on the flight. 


$\sum x = 82$; $\sum x^2 = 704$, so $\bar{x} = \frac{82}{10} = 8.2$; $s^2 = \frac{1}{9}\left(704 - \frac{82^2}{10}\right) = 3.51$


are the best available estimates of the mean and variance.


Example 3 


The weights, x kg, of a random sample of suitcases checked in on a long
haul flight are given below. Find estimates of the mean and standard
deviation of the weight of suitcases checked in for the flight.


Weight (kg) | 4.5 ≤ x < 9.5 | 9.5 ≤ x < 14.5 | 14.5 ≤ x < 19.5 | 19.5 ≤ x < 24.5
Frequency, f | 5 | 9 | 35 | 24


The interval midpoints are 7, 12, 17 and 22,
$\sum f = 73$; $\sum mf = 1266$; $\sum m^2 f = 23272$, so

$\bar{x} = \frac{1266}{73} = 17.3$; $s^2 = \frac{1}{72}\left(23272 - \frac{1266^2}{73}\right) = 18.28387...$

$\Rightarrow s = \sqrt{18.28387...} = 4.28$
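
The grouped-data calculation in Example 3 follows the same pattern, with each midpoint weighted by its frequency. The sketch below uses the midpoints and frequencies from the example; the variable names are illustrative.

```python
from math import sqrt

midpoints = [7, 12, 17, 22]       # interval midpoints from Example 3
frequencies = [5, 9, 35, 24]

n = sum(frequencies)
sum_mf = sum(m * f for m, f in zip(midpoints, frequencies))
sum_m2f = sum(m * m * f for m, f in zip(midpoints, frequencies))

mean = sum_mf / n
s_squared = (sum_m2f - sum_mf ** 2 / n) / (n - 1)   # unbiased estimate of variance
s = sqrt(s_squared)

print(n, sum_mf, sum_m2f, round(mean, 1), round(s_squared, 5), round(s, 2))
```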


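Proof that $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ is an unbiased estimator of the population variance

The proof makes use of a number of relationships which you have met previously:

$\sigma^2 = \mathrm{Var}(X) = E\{(X - \mu)^2\} = E(X^2) - \mu^2 \Rightarrow E(X^2) = \mu^2 + \sigma^2$ (see Section 5.4 of S1)

$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = E\{(\bar{X} - \mu)^2\} = E(\bar{X}^2) - \mu^2 \Rightarrow E(\bar{X}^2) = \mu^2 + \frac{\sigma^2}{n}$ (see Chapter 3 of S2)

$\sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$ (see Section 2.3 of S1, multiplied through by n)


Then the result for the unbiased estimator follows quite quickly:

\[ E\left\{\sum (X_i - \bar{X})^2\right\} = E\left\{\sum X_i^2 - n\bar{X}^2\right\} = E\left\{\sum X_i^2\right\} - nE\left\{\bar{X}^2\right\} \]

\[ = n(\mu^2 + \sigma^2) - n\left(\mu^2 + \frac{\sigma^2}{n}\right) = n\mu^2 + n\sigma^2 - n\mu^2 - \sigma^2 = (n-1)\sigma^2 \]

So $E\left\{\frac{1}{n-1}\sum (X_i - \bar{X})^2\right\} = \sigma^2$.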
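
The result of the proof can also be seen in a simulation. The sketch below assumes a normal population with mean 50 and standard deviation 5 (so the true variance is 25) and small samples of size 5, where the difference between the two divisors is easy to see; these values and the seed are chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(8)                        # fixed seed so the run is repeatable
mu, sigma, n, repeats = 50, 5, 5, 40000

divisor_n, divisor_n_minus_1 = [], []
for _ in range(repeats):
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    divisor_n.append(statistics.pvariance(xs))          # divides by n
    divisor_n_minus_1.append(statistics.variance(xs))   # divides by n - 1

# Averaged over many samples, the n - 1 version settles near sigma^2 = 25,
# while the n version settles near (n - 1)/n * sigma^2 = 20.
print(round(statistics.fmean(divisor_n), 1),
      round(statistics.fmean(divisor_n_minus_1), 1))
```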


While using the mid-interval value for grouped frequency data is the best you can do,
it is worth being aware of quite common situations where the best is really not all
that good. When data tail off considerably at one or both ends it is common for
unequal intervals to be used: wider intervals where not much is going on so that
it is less likely that you try to over-interpret sparse data, and narrow intervals where the
data are dense so that you can see more detail there. The wide intervals cause considerable
inaccuracy to be introduced in estimating the variance and standard deviation, but where
the distribution is close to symmetrical, the estimate of the mean is likely to be reasonably
accurate. However, if the distribution is heavily skewed, the estimate of the mean is going
to be affected as well as the estimates of measures of spread.


Consider the following histogram, of a data set which is a large sample of household
incomes, in thousands of dollars. The data were grouped in intervals of widths
20, 20, 20, 40 and 100 thousand dollars and the distribution is heavily skewed. The
estimate of the mean income is $40 700, with an estimated standard deviation of $33 200.

[Figure: histogram of household income; vertical axis 'Frequency density', horizontal axis 'Thousands of dollars' from 0 to 200]

If you model the data to estimate the frequencies in intervals of constant
width to replace the top two wide intervals, it looks something like this:

[Figure: histogram of household income with the two wide intervals replaced by modelled intervals of constant width; vertical axis 'Frequency density', horizontal axis 'Thousands of dollars' from 0 to 200]


The estimate of the mean income is now $38 900, with an estimated
standard deviation of $29 000.


The second diagram models the behaviour in the two larger intervals as maintaining
the same ratio as the immediately previous intervals. The first three bars are
identical in the two diagrams, and the shape of the tail in the second is
what the distribution might plausibly look like if more detail were available.
The difference in the estimate of the mean is around 5%, and in the standard
deviation it is about 14%.


You are not expected in this course to do anything with grouped data other 
than use mid-interval values in estimating mean and variance/standard 
deviation, but when you work with your own data you should be more 
cautious when basing decisions on grouped data, especially if it is skewed, 
than you would be using estimates based on individual data values. 


Exercise 7.3 


1. For the following samples of data, calculate unbiased estimates of the 
mean and variance of the populations the samples were drawn from: 


a) | 32 | 25 | 29 | 31 | 27 | 39 | 24 | 22 | 31 


b) | 11 | 14 | 12 | 18 | 11 | 7 | 12 | 18 | 21 | 17 | 16 | 12


c) | 65 | 67 | 71 | 73 | 68 | 71 | 74 | 66 


d) | 89 | 78 | 88 | 79 | 83 | 84 | 82 | 95 | 95 | 77 | 83 


e) | 17 | 35 | 56 | 25 | 44 | 61 | 57 | 22 | 32 | 55 


2. For the following samples of data, calculate unbiased estimates of the
mean and variance of the populations the samples were drawn from:


a) x | 12 | 13 | 14 | 15 | 16 | 17 | 18
Frequency | 5 | 9 | 14 | 11 | 8 | 3 | 1


b) [x 25 | 26 | 27 | 28 | 29 | 30 | 31 
Frequency | 6 | 11 | 12 | 12 | 8 5 


3. For the following samples of data, calculate the best estimates of the
mean and variance of the populations the samples were drawn from:


a)
Interval | 0 ≤ x < 5 | 5 ≤ x < 10 | 10 ≤ x < 15 | 15 ≤ x < 20 | 20 ≤ x < 25 | 25 ≤ x < 30
Frequency | 3 | 9 | 16 | 14 | 7 | 2

b)
Interval | 0 ≤ x < 10 | 10 ≤ x < 20 | 20 ≤ x < 25 | 25 ≤ x < 30 | 30 ≤ x < 40 | 40 ≤ x < 50
Frequency | 83 | 121 | 87 | 79 | 106 | 65


4. For the following summary statistics of samples of data, calculate unbiased
estimates of the mean and variance of the populations the samples were drawn from:


a) $n = 50$; $\sum x = 423$; $\sum x^2 = 4956$
b) $n = 23$; $\sum x = 835$; $\sum x^2 = 37825$
c) $n = 68$; $\sum x = 695$; $\sum x^2 = 12457$
d) $n = 8$; $\sum x = 657$; $\sum x^2 = 75688$
e) $n = 97$; $\sum x = -25$; $\sum x^2 = 154$


5. The times taken by a random sample of 15 students attending a college 
to travel to college were measured. The data are summarised by 
$\sum t = 387$; $\sum t^2 = 13221$. Find 

a) the variance of the times in the sample 

b) an unbiased estimate of the variance of the travel times taken by all students. 


6. Unbiased estimates of the mean and variance of a population, based 
on a random sample of 10 observations, are 24.3 and 5.2 respectively. 
Another random observation is obtained and is measured at 21.5. 
Find new unbiased estimates of the mean and variance of the population, 
based on the full information now available. 




7.4 Confidence intervals for the mean of a normal distribution 


In Section 7.1 you looked informally at the rationale for an interval estimate 
rather than a point estimate — that it allows you to give some idea of how 
good the estimator is, and that it is dependent on the variability of the 
population and of the size of the sample used. 


For any distribution, you know that the variance of $\bar{X}$ is $\frac{\sigma^2}{n}$. The 
standard deviation of the sampling distribution, $\frac{\sigma}{\sqrt{n}}$, is known as the 
standard error, and in Chapter 4 you saw that the distribution of $\bar{X}$ is also 
normal if the samples come from a normal distribution, 
so $P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$. 

The only problem is that you do not actually know where $\mu$ is, but you do 
know where the observed sample mean $\bar{x}$ is, and you can use a very neat 
piece of logic to turn this round: on 95% of occasions the sample mean 
will be captured by an interval centred on $\mu$ and extending $\pm 1.96\frac{\sigma}{\sqrt{n}}$; 
every time $\bar{X}$ lies in that interval, $\mu$ has to be captured by an interval of 
the same size centred on $\bar{X}$. 


$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ is the 95% confidence interval (CI) for the mean of a normal 
distribution based on a random sample of size n. 


The diagram below tries to illuminate this visually: 
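[Diagram: four intervals of equal width centred on observed values of $\bar{x}$, drawn above an axis marked low, $\mu$ and high; the two that miss $\mu$ are labelled 'outside'.] 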


The bottom of the diagram shows $\mu - 1.96\frac{\sigma}{\sqrt{n}}$ and $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ as the interval centred 
on the true population mean $\mu$. There are then four intervals shown, of the same width, 
centred on four possible observed values of $\bar{x}$. For the bottom two, the value of $\bar{x}$ lies 
inside the interval centred on $\mu$ but for the top two the value of $\bar{x}$ lies outside. The bold 
black line shows where the value of $\mu$ is, and it cuts the two bottom intervals but not 
the top two intervals. 

So 95% of the time the true population mean will be captured by a confidence interval 
constructed in this way. You should talk about the true population mean lying in the 
confidence interval on 95% of occasions a confidence interval is constructed rather than 
talking about a probability — because the population mean is not a random variable. 
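
The capture argument above can be checked by simulation. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size and the function name `coverage` are assumptions chosen for the demonstration, not values from the text. It builds many 95% intervals and reports the proportion that capture the true mean.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, trials, seed=1):
    """Proportion of simulated 95% intervals that capture the true mean mu."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    half_width = 1.96 * sigma / sqrt(n)    # 1.96 standard errors
    hits = 0
    for _ in range(trials):
        # Draw a random sample of size n from N(mu, sigma^2) and find its mean
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # The interval centred on the sample mean either captures mu or misses it
        if sample_mean - half_width < mu < sample_mean + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 50, standard deviation 10, samples of size 25
print(coverage(mu=50, sigma=10, n=25, trials=10_000))
```

The printed proportion should be close to 0.95, and it is the interval, not $\mu$, that changes from one sample to the next.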




The usual size of confidence interval is 95%, but it is not the only possible 
one. It is commonly used because of the shape of the normal 

distribution — the curve really starts to tail off very quickly at around 

two standard deviations from the mean. 


Technically, you can construct any size of confidence interval you choose 
to for a continuous distribution but in practical terms there are three 
others which are the most common alternatives: 90%, 98% and 99%. 


For the mean of a normal distribution based on a random sample of size n, 

$\bar{x} \pm 1.645\frac{\sigma}{\sqrt{n}}$ is the 90% confidence interval 

$\bar{x} \pm 2.326\frac{\sigma}{\sqrt{n}}$ is the 98% confidence interval 

$\bar{x} \pm 2.576\frac{\sigma}{\sqrt{n}}$ is the 99% confidence interval 


If you have a 95% confidence interval the tails each contain 2.5%. 

More generally, if you have an $\alpha$% confidence interval the tails 
each contain $\frac{100 - \alpha}{2}$%. 
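
The multipliers 1.645, 1.96, 2.326 and 2.576 follow from the tail rule above. As a minimal sketch, assuming you have Python's standard `statistics` module available (the helper name `z_multiplier` is invented for this illustration), you can recover each one from the tail area:

```python
from statistics import NormalDist

def z_multiplier(level_percent):
    """z-score leaving (100 - level)/2 percent in each tail of N(0, 1)."""
    tail = (100 - level_percent) / 2 / 100      # area in one tail
    return NormalDist().inv_cdf(1 - tail)       # inverse of the standard normal cdf

for level in (90, 95, 98, 99):
    print(level, round(z_multiplier(level), 3))
```

In the examination you would read these values from the normal tables instead.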
Example 4 
The weights, in grams, of bags of cocoa beans are known to have a standard deviation 
of 6 grams. A random sample of bags is weighed with the following results: 

758, 748, 749, 752, 757, 760, 751, 745, 759, 761 


Calculate a 95% confidence interval for the mean weight of a bag of cocoa beans. 


$\sum x = 7540$; $\bar{x} = \frac{7540}{10} = 754$, so the 95% confidence interval is 

$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 754 \pm 1.96 \times \frac{6}{\sqrt{10}} = 754 \pm 3.7 = (750.3, 757.7)$ 
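
If you want to check a calculation like Example 4 by computer, a short sketch along the following lines may help. The function name `ci_known_sigma` is an assumption made for this illustration; the data and the standard deviation of 6 grams are taken from the example.

```python
from math import sqrt
from statistics import NormalDist, mean

def ci_known_sigma(data, sigma, level=0.95):
    """Confidence interval for the mean when the population s.d. is known."""
    n = len(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    half_width = z * sigma / sqrt(n)            # z times the standard error
    centre = mean(data)
    return centre - half_width, centre + half_width

weights = [758, 748, 749, 752, 757, 760, 751, 745, 759, 761]
low, high = ci_known_sigma(weights, sigma=6)
print(round(low, 1), round(high, 1))   # 750.3 757.7
```

Changing `level` to 0.90, 0.98 or 0.99 gives the other common interval sizes.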


Exercise 7.4 


1. The weights, in grams, of bags of onions are known to be normally 
distributed and have a standard deviation of 30 grams. A random 
sample of bags is weighed with the following results: 


742, 775, 712, 765, 712, 703, 727, 772, 731, 781 
Calculate a 95% confidence interval for the mean weight of a bag of onions. 

2. The volumes, in millilitres, of bottles of water are known to be normally 
distributed and have a standard deviation of 9.3 millilitres. A random 
sample of bottles is measured with the following results (in millilitres): 


605, 617, 592, 603, 621, 599, 612, 603, 609, 614, 595, 610 
i) Calculate a 95% confidence interval for the mean volume of a bottle of water. 
ii) Will a 90% confidence interval for the mean volume be wider or narrower 
than the 95% CI in part i)? 


3. The heights, in centimetres, of a type of plant are known to be normally 
distributed and have a standard deviation of 8 centimetres. A random 
sample of plants is measured with the following results: 


35, 38, 42, 31, 33, 45, 34, 46, 40, 36 
Calculate a 90% confidence interval for the mean height of the plants. 

4. The weights, in grams, of bowls made in a particular factory are known 
to be normally distributed and have a standard deviation of 21 grams. 
A random sample of bowls is weighed with the following results: 


195, 232, 211, 242, 201, 223, 247, 161, 226, 237 
i) Calculate a 95% confidence interval for the mean weight of the bowls. 
ii) Calculate a 98% confidence interval for the mean weight of the bowls. 




5. The heights, in centimetres, of a type of plant are known to be normally 
distributed and have a standard deviation of $\sigma$ centimetres. A random 
sample of 25 plants is measured and the 95% confidence interval obtained is (43.2, 47.8). 

i) What was the sample mean, $\bar{x}$? 

ii) What is the value of $\sigma$, correct to 1 decimal place? 


7.5 Confidence intervals for the mean of a large sample from any distribution 


In Chapter 6 you met the Central Limit Theorem, which said that for large 
samples (as a rule of thumb, for n > 30) the distribution of the sample 
mean is approximately normal. In Chapter 3 you learned that the variance 
of $\bar{X}$ is always $\frac{\sigma^2}{n}$ and in this chapter you have learned that $s^2$ is an 
unbiased estimator of the population variance (and that even where 
the data are given in a frequency table for intervals, if it is not unbiased, 
$s^2$ is still the best estimator available). 


Using all of these, we can deduce that: 
For a large random sample (n > 30) from any population, a 95% confidence 
interval is given by $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ when the variance is known 
(quite rare) or by $\left(\bar{x} - 1.96\frac{s}{\sqrt{n}},\; \bar{x} + 1.96\frac{s}{\sqrt{n}}\right)$ when the population variance 
has had to be estimated from the sample using $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ or one 
of the computational equivalents. 
Other size confidence intervals are obtained by replacing 1.96 in this 
expression by the corresponding z-score for the required level of confidence. 


Note that strictly speaking this is an approximate confidence interval because 
you are using the Central Limit Theorem to say that the distribution of the 
sample means is approximately normal. 


Example 5 


The weights, x, in grams, of a random sample of 50 adult hamsters were recorded. 
The following summary statistics were calculated: 

$\sum x = 10003.3$, $\sum x^2 = 2\,002\,665$ 

Calculate a 95% confidence interval for the mean weight of the hamsters. 


$\sum x = 10003.3$, $\bar{x} = \frac{10003.3}{50} = 200.07$ (2 d.p.) 

$s^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right) = \frac{1}{49}\left(2\,002\,665 - \frac{10003.3^2}{50}\right) = 27.44\ldots$ 

$\Rightarrow s = 5.24$ 
so the 95% confidence interval is 
$\bar{x} \pm 1.96\frac{s}{\sqrt{n}} = 200.07 \pm 1.96 \times \frac{5.24}{\sqrt{50}} = 200.07 \pm 1.45 = (198.6, 201.5)$ 

It is sensible to keep extra accuracy during the working out and only round 
at the end, but don't give too much accuracy at the end as it implies a level of 
knowledge which is not justified. The width of the interval is really the thing 
to consider when deciding the appropriate accuracy: here $1.96\frac{s}{\sqrt{n}}$ was 
±1.45 and the CI limits were calculated using both the mean and width to 2 
decimal places before rounding to 1 decimal place in the final statement of 
the CI. Even though there are 4 significant figures given in the CI limits, the 
accuracy is really only 2 significant figures in the interval width. 
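
The working in Example 5 can be reproduced from the summary statistics alone. The following is a sketch only; the function name `ci_from_summary` is invented here, and it keeps full accuracy until the final rounding, as the note above recommends.

```python
from math import sqrt

def ci_from_summary(n, sum_x, sum_x2, z=1.96):
    """Large-sample CI for the mean from n, sum of x and sum of x squared."""
    mean = sum_x / n
    # Unbiased estimate of the population variance (divisor n - 1)
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    half_width = z * sqrt(s2 / n)
    return mean, s2, (mean - half_width, mean + half_width)

# Summary statistics for the 50 hamsters in Example 5
mean, s2, (low, high) = ci_from_summary(50, 10003.3, 2002665)
print(round(mean, 2), round(s2, 2))        # 200.07 27.44
print(round(low, 1), round(high, 1))       # 198.6 201.5
```

Passing `z=1.645` instead would give the 90% interval from the same statistics.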


Example 6 

A machine is supposed to produce metal rods which are 5.7 cm long. 
A random sample of 100 rods produced by the machine are measured 
and the lengths, x cm, are summarised below. 


x | 5.60 ≤ x < 5.65 | 5.65 ≤ x < 5.70 | 5.70 ≤ x < 5.75 | 5.75 ≤ x < 5.80 
Frequency, f | 15 | 31 | 36 | 18 


Calculate a 95% confidence interval for the mean length of a rod produced by the machine. 


The mid-interval values, m, are 5.625, 5.675, 5.725 and 5.775 
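$\sum mf = 570.35$, $\bar{x} = \frac{570.35}{100} = 5.7035$, $\sum m^2 f = 3253.218$ 

$s^2 = \frac{1}{n-1}\left(\sum m^2 f - \frac{(\sum mf)^2}{n}\right)$ where $n = \sum f = 100$ 

$= \frac{1}{99}\left(3253.218 - \frac{570.35^2}{100}\right) = 0.002286$ 

$\Rightarrow s = 0.047808$ 
so the 95% confidence interval is 

$\bar{x} \pm 1.96\frac{s}{\sqrt{n}} = 5.7035 \pm 1.96 \times \frac{0.047808}{\sqrt{100}}$ 

$= 5.7035 \pm 0.00937 = (5.694, 5.713)$ 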
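
For grouped data the same steps apply with mid-interval values standing in for the observations. This sketch uses the table from Example 6; the function name `grouped_ci` is an assumption for illustration.

```python
from math import sqrt

def grouped_ci(mids, freqs, z=1.96):
    """CI for the mean from mid-interval values and their frequencies."""
    n = sum(freqs)
    sum_mf = sum(m * f for m, f in zip(mids, freqs))
    sum_m2f = sum(m * m * f for m, f in zip(mids, freqs))
    mean = sum_mf / n
    s2 = (sum_m2f - sum_mf ** 2 / n) / (n - 1)   # best estimate of the variance
    half_width = z * sqrt(s2 / n)
    return n, mean, (mean - half_width, mean + half_width)

mids = [5.625, 5.675, 5.725, 5.775]    # mid-interval values from Example 6
freqs = [15, 31, 36, 18]
n, mean, (low, high) = grouped_ci(mids, freqs)
print(n, round(mean, 4))                   # 100 5.7035
print(round(low, 3), round(high, 3))       # 5.694 5.713
```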




Exercise 7.5 


1. The summary statistics of a number of random samples are given below. 
For each one, give 


i) a 95% confidence interval for the mean ii) a 90% confidence interval for the mean. 
a) $n = 50$; $\sum t = 357$; $\sum t^2 = 12712$ 

b) $n = 37$; $\sum x = 1635$; $\sum x^2 = 75325$ 

c) $n = 62$; $\sum v = 651$; $\sum v^2 = 12712$ 

d) $n = 71$; $\sum w = 617$; $\sum w^2 = 27688$ 

e) $n = 97$; $\sum p = -32.4$; $\sum p^2 = 174$ 


2. In Q1, does it make any difference if the populations are known to be normally distributed? 


3. A random sample of 45 cans of mixed beans gave unbiased estimates of 324.6 
grams and 6.2 grams² for the population mean and variance respectively. 
Calculate a 95% confidence interval for the population mean weight. 


4. The times taken to complete a task were recorded for a random sample 
of 60 workers. The sample mean was 28.3 minutes and the unbiased estimate 
of the population variance was 31.2 minutes². Find a 90% confidence interval 
for the mean time taken to complete the task. 


5. A random sample of 75 cans of paint gave unbiased estimates of 
503.2 ml and 14.3 ml² for the population mean and variance respectively. 
Calculate a 95% confidence interval for the population mean volume. 


6. A random sample of 40 bottles of cleaning fluid gave unbiased estimates 
of 102.1 cl and 7.3 cl² for the population mean and variance respectively. 
Calculate a 98% confidence interval for the population mean volume. 


7.6 Confidence intervals for a proportion 


In Chapter 10 of S1 you met the normal approximation to the binomial distribution. 
• The binomial distribution B(n, p) can be approximated by the normal 
distribution N(np, npq) provided both np and nq = n(1 − p) are greater than 5. 
• The continuity correction must be used because a discrete distribution 
is being approximated by a continuous distribution. 
When you first met the binomial distribution, the condition that it required 
only two outcomes may have seemed to be so constraining that the binomial 
would have little practical use, until it became clear that the outcomes for any 
situation could be partitioned into two categories. These are often categorised 
as ‘success’, for those you would count, and ‘failure’, for those that you would 
not, such as ‘throw a six’ with a fair die or ‘the height of a plant is over 27 cm’. 
The binomial distribution is then giving you the number of times, x, that some 
condition is seen out of n observations of the population, which allows the 
proportion of times the condition is seen, $\frac{x}{n}$, to be calculated. 


Imagine you wanted to estimate the proportion of people who are left handed. 
If you took a random sample of 100 people and found 20 were left handed, 
your best estimate of the proportion in the population would be 20%. 
If you had taken a random sample of 500 and had found 100 were left 
handed then your (point) estimate would again be 20%, but it is based 
on much more evidence than was available in the first case and you would 
feel that the precision of your estimate should be rather better, and any 
interval estimate should be narrower for the large sample than for the smaller one. 


Formally, for the general case: $p_s = \frac{X}{n}$ has approximate distribution $N\left(p, \frac{pq}{n}\right)$, 
where $q = 1 - p$. 
You need to ensure that the usual conditions for using the binomial are met, 
so there needs to be a fixed number of trials, only two possible outcomes, 
with a constant probability of ‘success’, and the trials must be independent 
of one another. 


Using all of these, we can deduce that: 


For a large random sample, size n, a 95% confidence interval for the 
proportion, p, with a particular property is given by 

$\left(p_s - 1.96\sqrt{\frac{p_s q_s}{n}},\; p_s + 1.96\sqrt{\frac{p_s q_s}{n}}\right)$ 

where $p_s$ is the sample proportion with the property, and $q_s = 1 - p_s$. 


Other size confidence intervals are obtained by replacing 1.96 in this 
expression by the corresponding z-score for the required level of 
confidence. 


Note that strictly speaking this is an approximate confidence interval because 
you are using the Central Limit Theorem to say that the distribution of the 
sample proportion is approximately normal, but it has a little more fuzziness 
as well — you have not applied the continuity correction and not adjusted to 
get an unbiased estimate of the variance. However, the extra precision which 
is gained by including those improvements is small and the effort required 

to make the adjustments needed to include them is large enough that it is 
usual not to do so — and you are not expected to in this course. 




Example 7 


A large urban area is considering making two of the five lanes 
on one of the main routes into the area into high-occupancy vehicle lanes at 
peak travel times to try to ease traffic congestion. Planning officials commission 
a study to estimate the proportion of vehicles currently carrying only the driver. 


The study recorded 653 vehicles entering the area on that route between 8.10 am 
and 8.15 am on a randomly selected ordinary working day. Analysis of the video 
recording suggested that 362 of these vehicles carried only the driver. 


i) Calculate a 95% confidence interval for the proportion of vehicles 
currently carrying only the driver on that route at peak times. 


ii) If the planning officials want to collect more data, suggest how they 
might do it in order to get the best information possible for their situation. 


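i) From the data, $p_s = \frac{362}{653}$ (= 55.4%), $n = 653$, $q_s = \frac{291}{653}$, so the 95% confidence 
interval is given by 

$\left(p_s - 1.96\sqrt{\frac{p_s q_s}{n}},\; p_s + 1.96\sqrt{\frac{p_s q_s}{n}}\right) = \left(\frac{362}{653} - 1.96\sqrt{\frac{362 \times 291}{653^3}},\; \frac{362}{653} + 1.96\sqrt{\frac{362 \times 291}{653^3}}\right) = (0.516, 0.592)$ 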


ii) The expense of collecting data for this situation comes in two parts - video 
recording the traffic, and then analysing the video to identify vehicles with 
only the driver. Collecting only one block of data probably keeps those costs 
to a minimum, but recording five 1-minute blocks would cost the same 
for the analysis, and allow them to see if there is any variation in proportion 
at different times. Most of the costs of the recording would be in getting the 
people and equipment there and set up — having them stay for an hour 
longer to record a number of short blocks would cost very little more. 
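
The interval in part i) can be checked with a short calculation. This is a sketch under the same approximations as the text (no continuity correction, no adjustment for bias); the function name `proportion_ci` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Approximate confidence interval for a population proportion."""
    p_s = successes / n                          # sample proportion
    q_s = 1 - p_s
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for 95%
    half_width = z * sqrt(p_s * q_s / n)
    return p_s - half_width, p_s + half_width

# 362 of the 653 recorded vehicles carried only the driver
low, high = proportion_ci(362, 653)
print(round(low, 3), round(high, 3))   # 0.516 0.592
```

Because the half-width depends on $\sqrt{1/n}$, four times as many vehicles would be needed to halve the width of the interval.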


Exercise 7.6 


1. The sample size, n, and number of people, s, in the sample satisfying a 
particular criterion are given below for a number of surveys. Assuming 
these are random samples, calculate 


i) a 95% confidence interval for the population proportion 
ii) a 90% confidence interval for the population proportion 
satisfying the criterion in each survey. 

a) n=120;s=37 

b) n=85;s=12 

c) n=62;s=25 

d) n=71;s=22 

e) n=97;s=14 
2. A random sample of 100 bolts was measured and 12 of them were found 
to lie outside the production tolerance limits. Find a 95% confidence interval 
for the proportion of bolts which lie outside the production tolerance levels. 


3. A random sample of 80 city workers is taken and 23 of them usually travel to 
work by bus. Find a 95% confidence interval for the proportion of city workers 
who usually travel to work by bus. 


4. During the morning 300 cars were observed on a busy road and 

143 of them carried no passengers. 

i) Find a 90% confidence interval for the proportion of cars on the road 
which carry no passengers. 

ii) State any assumptions that you have had to make in constructing the 
confidence interval. 

iii) An $\alpha$% confidence interval is constructed using the same sample. 
The interval has width 0.15. Find the value of $\alpha$. 


5. A random sample of 250 items for which the prices were compared in two 
large shops found that Easipay were cheaper on 112 items than Valustore. 
i) Find a 95% confidence interval for the proportion of items for which 
Easipay are cheaper than Valustore. 
ii) Estimate the size of the sample needed for an approximate 95% 
confidence interval to have width 0.05. 


6. A random sample of n people found that 53 of them preferred to listen to 
news on the radio rather than in any other format. A symmetric confidence interval 
for the population proportion who prefer to listen to news on the radio 
is 0.167 < p < 0.271. 


i) Find the midpoint of this interval and use it to calculate the size of the sample. 


ii) Find the confidence level of this interval. 


Summary exercise 7 


1. For the following sample of data, calculate unbiased estimates of the mean and variance 
of the population: 

42, 25, 27, 53, 41, 62, 27, 35, 56, 37, 46, 51, 29 

2. For the following sample of data, calculate unbiased estimates of the mean and variance 
of the population: 

x | 22 | 23 | 24 | 25 | 26 | 27 | 28 
Frequency | 17 | 23 | 29 | 45 | 32 | 19 | 8 

3. For the following sample of data, calculate the best estimates of the mean and variance of the 
population: 
Interval | 0 ≤ x < 5 | 5 ≤ x < 10 | 10 ≤ x < 15 | 15 ≤ x < 20 | 20 ≤ x < 25 | 25 ≤ x < 30 
Frequency | 19 | 37 | 55 | 67 | 26 | 8 




4. For the following summary statistics of 
samples of data, calculate unbiased estimates 
of the mean and variance of the populations 
the samples were drawn from: 

a) $n = 90$; $\sum x = 353$; $\sum x^2 = 4172$ 

b) $n = 19$; $\sum x = 689$; $\sum x^2 = 54369$ 


5. The heights, in centimetres, of a type of plant 
are known to be normally distributed and 
have a standard deviation of 11 centimetres. 
A random sample of plants is measured with 
the following results: 

75, 68, 86, 92, 71, 83, 92, 65, 78, 82 
Calculate a 90% confidence interval for the 
mean height of the plants. 


6. The heights, in centimetres, of a type of plant 
are known to be normally distributed and 
have a standard deviation of $\sigma$ centimetres. 
A random sample of 35 plants is measured 
and the 95% confidence interval obtained is 
(36.7, 44.9). 

i) What was the sample mean, $\bar{x}$? 

ii) What is the value of $\sigma$, correct to 1 
decimal place? 


7. The summary statistics of two random 
samples are given below. For each one, give 

i) a 95% confidence interval for the mean 

ii) a 90% confidence interval for the mean. 
a) $n = 130$; $\sum t = -67.5$; $\sum t^2 = 12231$ 
b) $n = 82$; $\sum x = 952$; $\sum x^2 = 14526$ 


8. The times taken to complete a puzzle were 
recorded for a random sample of 70 students. 
The sample mean was 12.3 minutes and the 
unbiased estimate of the population variance 
was 9.2 minutes². Find a 90% confidence 
interval for the mean time taken to complete 
the puzzle. 


9. A random sample of 75 tiles was measured 
and 8 of them were found to have slight 
flaws. Find a 95% confidence interval for the 
proportion of tiles with slight flaws. 


10. A random sample of 120 commuters is taken 
and 28 of them say they would like to change 
job in order to cut down travel time. Find a 
95% confidence interval for the proportion 
of commuters who would like to change job in 
order to cut down travel time. 


11. A random sample of n people found that 35 
of them did not vote in a recent referendum. 
A symmetric confidence interval for the 
population proportion who did not vote in 
the referendum is 0.187 < p < 0.383. 

i) Find the midpoint of this interval and 
use it to calculate the size of the sample. 

ii) Find the confidence level of this interval. 


EXAM-STYLE QUESTIONS 

12. Diameters of table tennis balls are known 
to be normally distributed with mean $\mu$ 
and standard deviation $\sigma$. A random 
sample of 125 table tennis balls was 
taken and the diameters, d cm, were 
measured. The results are summarised by 
$\sum d = 498.2$; $\sum d^2 = 1985.78$. 
i) Calculate unbiased estimates of $\mu$ and $\sigma^2$. 
ii) Calculate a 96% confidence interval for $\mu$. 
iii) 200 random samples of 125 balls are 
taken and a 96% confidence interval is 
calculated for each sample. How many 
of these intervals would you expect not to 
contain $\mu$? 


13. A survey was conducted to find the 
proportion of people who attend an aerobics 
class on a regular basis. It was found that 89 
people out of a random sample of 190 people 
attend an aerobics class on a regular basis. 

i) Calculate a 98% confidence interval 
for the true proportion of people who 
attend an aerobics class on a regular 
basis. 

A second survey to find the proportion of 
people who attend an aerobics class on a 
regular basis was conducted in a shopping 
centre on a Wednesday afternoon. 

ii) Give a reason why this is not a 
satisfactory sample. 


14. The weights, in grams, of packets of flour 
are distributed with mean $\mu$ and standard 
deviation 15.3. A random sample of 85 
packets is taken. The mean weight of the 
sample is found to be 748 g. Calculate a 97% 
confidence interval for $\mu$. 


15. A random sample of 150 customers of a bank 
are surveyed and 39 of them say that they are 
not satisfied with the way the bank handled 
branch closures in their local area. Find a 
95% confidence interval for the proportion 
of customers who were not satisfied. 


16. The bolts made by a particular machine are 
known to be normally distributed with 
standard deviation 1.2 mm. The quality 
control manager wants to check that the mean 
production level matches the setting on the 
machine to within 0.5 mm. Find the smallest 
sample size required to obtain a 99% confidence 
interval with total width less than 1 mm. 


17. The shoulder heights are measured, in 
metres, for a random sample of 43 adult 
giraffes. The heights are summarised by 
$\sum x = 141.2$, $\sum x^2 = 465.54$. 

a) Calculate unbiased estimates of the 
population mean and variance of the 
shoulder height of adult giraffes. 

b) Calculate a 95% confidence interval for 
the mean shoulder height. 


Chapter summary 


Some point estimates, such as $\bar{x}$, give the best single value estimate of the location of a parameter. 
An interval estimate gives a range of values designed to give both an estimate of the location 
of the parameter and some indication of how precise or reliable that estimate is. 

$\bar{x} = \frac{\sum x}{n}$ is the best unbiased estimate of the population mean $\mu$. 

$s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$ is an unbiased estimator of 
the population variance, $\sigma^2$. 

$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ is the 95% confidence interval for the mean of a normal distribution based on a 
random sample of size n. 

$\bar{x} \pm 1.96\frac{s}{\sqrt{n}}$ is the 95% confidence interval for the mean, $\mu$, of a distribution based on a large 
random sample of size n (n > 30). 




Hypothesis testing for discrete distributions 


Psychology, medicine, archaeology, 
manufacturing processes and many 
other areas of modern life routinely use 
statistical testing to make decisions about 
what constitutes unusual behaviour or 
indicates a departure from the norm in 
a group. The consequences of making 
wrong decisions may vary from minor 
inconvenience, to financial losses in 
millions of pounds, to life and death, 
so understanding the logic of testing is 
critical for people working in these areas. 


Objectives 

After studying this chapter you should be able to: 

• Understand the nature of a hypothesis test, the difference between one-tail and two-tail tests, 
and the terms null hypothesis, alternative hypothesis, significance level, rejection region (or 
critical region), acceptance region and test statistic. 

• Formulate hypotheses and carry out a hypothesis test in the context of a single observation 
from a population which has a binomial or Poisson distribution, using either direct 
evaluation of probabilities or a normal approximation, as appropriate. 

• Understand the terms Type I error and Type II error in relation to hypothesis tests. 

• Calculate the probabilities of making Type I and Type II errors in specific situations involving 
tests based on a normal distribution or direct evaluation of binomial or Poisson probabilities. 


Before you start 


You should know how to: 
1. Calculate probabilities for the binomial 
distribution, e.g. 
If X ~ B(10, 0.3) find the largest value of x for 
which $P(X \le x) < 0.05$. 
$P(X = 0) = 0.7^{10} = 0.028$; 
$P(X = 1) = 10 \times 0.3 \times 0.7^9 = 0.121$ 
so x = 0 is the largest. 

2. Calculate probabilities for the Poisson 
distribution, e.g. 
If X ~ Po(7) find P(X < 3). 
$P(X < 3) = e^{-7}\left(1 + 7 + \frac{7^2}{2}\right) = 0.0296$ 

Skills check: 
1. If X ~ B(15, 0.1) find the least value of x for 
which P(X > x) < 0.1. 

2. If X ~ Po(10) find the largest value of x for 
which P(X < x) < 0.025. 


8.1 The logical basis for hypothesis testing 


Anil tossed a coin five times and got only 1 head. 
He thought the coin was biased. 


Vanya tossed another coin 500 times and got 129 heads. 
She thought her coin was biased. 


Could you make a convincing argument in either case that the coin was 
biased? Or that the coin was not biased? 


You could repeat their experiments with a fair coin and see how often you 
got something like their observations. If it rarely happens then you might 
be inclined to agree the coin was biased. You might be able to do this 
practically in Anil’s case but in Vanya’s it would take a very long time. 


You could use an electronic simulation, or compare their results against 
calculated probabilities (Vanya’s would need the normal approximation). 


A critical principle is to look for the likelihood of seeing at least as extreme 
a result as the observed. Cases where none of the 5 tosses give a head would 
be taken as well as those which give 1 head. 

The second critical principle is to also include the cases of 0 or 1 tail (5 or 

4 heads) as being ‘at least as extreme’ unless there was some reason to believe 
before Anil tossed the coin that the coin would show more heads than tails 
if it was biased. 

It is easier to see situations where it might apply if you were looking at, say, 
the proportion of faulty items on a production line. You would only be 
interested in whether there had been an increase in the proportion. 

‘As bad or worse’ is another way to express the idea of ‘at least as extreme a result’ 


For Anil’s case you can work out the probability distribution exactly: 


Number of heads | 0 | 1 | 2 | 3 | 4 | 5 
Probability | 0.03125 | 0.15625 | 0.3125 | 0.3125 | 0.15625 | 0.03125 


The likelihood of getting 0, 1, 4 or 5 heads is 0.375 or $\frac{3}{8}$. 
This is less than half, but more than one third. You would not refer to 
something which happens more often than 1 in 3 times as a ‘rare event’. 
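
The table for Anil's five tosses, and the probability of a result at least as extreme as his, can be generated directly from the binomial formula. This is a small sketch; the function name `binomial_pmf` is an assumption for illustration.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Distribution of the number of heads in 5 tosses of a fair coin
probs = [binomial_pmf(5, 0.5, k) for k in range(6)]
print(probs)    # [0.03125, 0.15625, 0.3125, 0.3125, 0.15625, 0.03125]

# 'At least as extreme' as 1 head, counting both tails: 0, 1, 4 or 5 heads
extreme = sum(probs[k] for k in (0, 1, 4, 5))
print(extreme)  # 0.375
```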


Similarly, it would not be reasonable to try to argue that this outcome 
provided positive support for the proposition that the coin was fair. 
If you tossed a coin which came up heads only 30% of the time then 
1 head in 5 tosses would be very common. 




The best you can say is that Anil’s result does not provide convincing evidence 
that the coin is biased. 


Vanya's observation in her experiment actually has a higher proportion 
of heads than Anil’s. However, with a large number of trials, the proportion 
seen is much more likely to be close to the probability and the likelihood of 
seeing Vanya’s result with a fair coin is almost vanishingly small. 
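
To see just how small, a computer can sum the exact binomial probabilities for Vanya's 500 tosses, which would be impractical by hand (by hand you would use the normal approximation instead). The sketch below is an illustration only; the function name `binomial_cdf` and the choice to double the lower tail for a two-sided figure are assumptions.

```python
from math import comb

def binomial_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# Probability of 129 or fewer heads in 500 tosses of a fair coin
lower_tail = binomial_cdf(500, 0.5, 129)
# By symmetry, 371 or more heads is equally likely, so double the lower tail
two_tail = 2 * lower_tail
print(two_tail)
```

The value printed is many orders of magnitude below any significance level you would use in practice.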


What happens if the situation is not as clear cut as either of these? 


Generally, the null hypothesis $H_0$ is the probability basis under which 
you consider a test. 


For the coin tossing example the null hypothesis would be $H_0 : p = \frac{1}{2}$. 


The alternative hypothesis $H_1 : p \neq \frac{1}{2}$ is what you would turn to if you felt 
there was convincing evidence that $H_0$ was not true. 


The principle of hypothesis testing is to see where the value of an 
observed statistic (the test statistic) lies within the sampling distribution 
under the null hypothesis. 


If it would be classed as a ‘rare event’ (and the significance level of a test 
is whatever level we use to judge ‘rare’) then you look for an alternative 
explanation. 


If the event observed is not an unusual occurrence when the null hypothesis 
is true, it will also not be unusual for a lot of other situations, so accepting the 
null hypothesis is a very, very weak statement; there is no positive supporting 
evidence that it is true. 


In many situations, you can determine the significance level of a test exactly 
(= the proportion to be classed as ‘rare’). 


5% is the standard level. 


If you decrease the significance level, then you are less likely to reject the 
null hypothesis incorrectly (i.e. when it is true), but you will automatically 
be more likely to accept the null hypothesis when it is not true. 


The 5% standard level is used to give a reasonable trade-off between the two types 
of error. If it is more important that you don't reject $H_0$ wrongly you can use 
a 1% significance level. If it is more important not to accept $H_0$ wrongly you 
can use a significance level above 5%. 


In this chapter you will be testing only a discrete distribution (the binomial 
distribution or Poisson distribution) and in Chapter 9 you will look at 
testing the mean of a distribution using the normal. 




For discrete cases taking only integer values, the decision rule may be in the 
form ‘Reject $H_0$ if $X \le 1$’, or ‘Reject $H_0$ if $X \le 2$’. Testing with n = 22, for p = 0.25 against 
p < 0.25, the probabilities of these events happening if the null hypothesis is true 
are 1.5% and 6.1%. 


In this case you cannot have a 5% test: you could have a 1.5% test or a 6.1% test. 


For this examination the size of the test stated is the maximum size which 
can be used, so the decision rule here will be ‘Reject $H_0$ if $X \le 1$’ and the 
actual size of the test will be 1.5%. 
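
The search for the decision rule in the n = 22, p = 0.25 case can be sketched as follows. The function names `binomial_cdf` and `lower_critical_value` are assumptions for illustration; the rule they apply is the one stated above, that the stated significance level is the maximum size allowed.

```python
from math import comb

def binomial_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def lower_critical_value(n, p, alpha):
    """Largest c with P(X <= c) <= alpha, or None if even c = 0 is too likely."""
    c = None
    for k in range(n + 1):
        if binomial_cdf(n, p, k) > alpha:
            break          # including k would push the size above alpha
        c = k
    return c

c = lower_critical_value(22, 0.25, 0.05)
print(c)                                   # 1, so reject H0 if X <= 1
print(round(binomial_cdf(22, 0.25, 1), 3)) # 0.015, the actual size of the test
print(round(binomial_cdf(22, 0.25, 2), 3)) # 0.061, too large for a 5% test
```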


To construct a test (critical region approach): 


• Identify formally the null and alternative hypotheses. 
  ○ The null hypothesis must give a distribution for the test statistic. 
    In this context it is always in the form 
    $H_0 : p = \ldots$ or $H_0 : \lambda = \ldots$ 
  ○ The alternative hypothesis is in one of the forms: 
    $H_1 : p > \ldots$ or $p < \ldots$ or $p \neq \ldots$ (or similar for the Poisson) 
• Give an explicit statement of the decision rule to be used 
(including the significance level you are using). 


• Apply the decision rule to the observed value of the test statistic. 
Avoid categorical statements in your conclusion statements. 
A test allows you to state either 
  ○ Accept $H_0$. 
    This is not because the evidence supports it, and you should not 
    say this. You can only conclude there is not significant evidence 
    against $H_0$. 
  ○ Reject $H_0$. 
    Conclude that there is evidence, significant at your chosen 
    significance level, that the mean or probability is not ... (or evidence 
    of an increase/decrease in mean or probability if it is one-tailed). 
• Give the conclusions of the test in the context of the problem. 


If you are told that the sample was discovered not to be random, then the 
conclusion of your test is invalidated. 




Example 1 


The proportion of applications from women to a particular course has 

been stable at 35% over a number of years. The course is revised. 

A random sample of 50 applications is taken to see if the proportion of 

women applying for the new course has changed. 

a) Give the null and alternative hypotheses you would use in the test. 

b) In fact 31 out of the 50 applications in the above test were from women, 
and the null hypothesis was rejected at the 5% level of significance. 
State the conclusion in context. 


a) H₀: p = 0.35 vs H₁: p ≠ 0.35. You are only looking at whether the
proportion has changed: the revision might make it more or less
attractive to women.

b) Reject H₀ and conclude there is sufficient evidence at the 5% level to
suggest that the proportion of applications from women is not 35%.


Example 2 


The number of motorbikes going over the speed limit on a road in a housing 

estate in a 10-minute period between 4 pm and 7 pm may be modelled 

by a Poisson random variable with parameter 2.2. 

Traffic humps are installed on the road as a traffic calming measure. 

a) Give the null and alternative hypotheses you would use in a test to see if 
the number of motorbikes breaking the speed limit on the road had 
been reduced. 

b) In a randomly selected 10-minute period between 4 pm and 7 pm
after the change was made there were two motorbikes breaking the speed
limit, and the null hypothesis was not rejected at the 5% level of significance.
State the conclusion in context.

a) H₀: λ = 2.2 vs H₁: λ < 2.2. You are only looking at whether the
average rate had been reduced.

b) Accept H₀ and conclude there is not sufficient evidence at the 5% level
to suggest that the average number of motorbikes breaking the speed
limit in a 10-minute period has been reduced.


When you do a hypothesis test, there are two possible types of error. 


Type I error:
The null hypothesis is actually true and you reject it.
The probability of this happening is the significance level of the test.


Type II error:
You accept the null hypothesis and it was not true.
The likelihood of this happening depends on what the true value of the
parameter is.


Exercise 8.1 


1. The proportion of primary school teachers who have passed A Level 
Maths is 23%. The government launches a campaign to encourage 
people with A Level Maths to consider teaching in primary schools. 


A random sample of 40 applications for training is taken to see if the 
proportion of people with A Level Maths has increased. Give the null 
and alternative hypotheses you would use in the test. 


2. 13 out of the 40 applications in the test in question 1 were from people with 
A Level Maths, and the null hypothesis was not rejected at the 5% level 
of significance. 

State the conclusion in context. 


3. The number of accidents on a particular stretch of road can be modelled 
by a Poisson distribution with a mean of 0.45 accidents per day on weekdays. 
Give the null and alternative hypotheses you would use in a test to see if 
the number of accidents on that stretch of road is different at the weekend. 


4. Over a month which has 8 weekend days in it there were 7 accidents 
recorded on the stretch of road in question 3, and the null hypothesis 
was rejected at the 5% level of significance. 

State the conclusion in context. 


8.2 Critical region 


In all cases the null hypothesis must provide a probability distribution for 
the test statistic. 


The basis of all tests is to see if the statistic calculated from the observations
lies in the most extreme part of this distribution, called the critical region
or the rejection region.

You reject H₀ if the observed value lies in this region. All other values lie in the
acceptance region of the test. The probability of the observed value falling in the
critical region when H₀ is actually true is called the size of the test. You use 5%
as the common default size.


A good test is one where the probabilities of both possible errors are as small
as possible. For a fixed sample, however, you can only reduce the probability
of a Type I error by widening the acceptance region to include cases previously
in the critical region. This immediately reduces the chance of rejecting H₀ when
it is true but also increases the probability of a Type II error.




A smaller significance level means that you are:

• less likely to reject H₀ incorrectly
• more likely to accept H₀ when it is not true.

A larger significance level means that you are:

• more likely to reject H₀ incorrectly
• less likely to accept H₀ when it is not true.


Consider the case where the null hypothesis is that λ = 7.5.
The sampling distribution looks like this.


[Figure: bar chart of the Poisson distribution with mean 7.5; horizontal axis: value of x]


If you wanted to test that the parameter λ was different from 7.5, then very
low or very high observed values would be considered ‘extreme’ and would
push you away from the null hypothesis. This would be a two-tail test,
because values in both tails of the distribution would be considered extreme.


[Figure: Poisson, mean = 7.5]


At the 5% level of significance, the two-tail test will reject H₀ if X is 0, 1 or 2
or if it is 14 or more. There is a probability of 2.0% at the bottom end and
another 2.2% at the top end, but this is as close to 5% as it is possible to get
with no more than 2.5% in each tail.

If you had some reason to expect that λ would only be lower than 7.5,
then only very low observed values would be considered ‘extreme’.
This would be a one-tail test.


[Figure: Poisson, mean = 7.5]

The one-tail test at the lower end has a critical region of X being 0, 1 or 2,
which has a probability of 2.0%. If the critical region also included 3 then
the probability of seeing a value in the critical region when H₀ was true
would be 5.9%.

There is nothing in between a 2.0% test and a 5.9% test because the Poisson
is a discrete distribution. For this examination, you would use 2%
(because if a 5% test is specified you must not exceed 5%), but outside of
this examination a strong case could be made to use the 5.9% test as closer
to the desired level.


[Figure: Poisson, mean = 7.5]

The one-tail test at the upper end has a critical region of X being 13 or more,
which has a probability of 4.3% (1 - P(X ≤ 12) = 1 - 0.957). If the critical
region included 12 as well then the probability of seeing a value in the
critical region when H₀ was true would be 7.9%.
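
The search for a one-tail critical region can be sketched in a few lines of Python. This is an illustrative sketch only; the function names pois_cdf, lower_critical_value and upper_critical_value are assumptions made here, not notation used in the course.

```python
from math import exp, factorial

def pois_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam); an empty sum (k < 0) gives 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def lower_critical_value(lam, level):
    """Largest c with P(X <= c) <= level, or None if even X = 0 is too likely."""
    c = -1
    while pois_cdf(c + 1, lam) <= level:
        c += 1
    return c if c >= 0 else None

def upper_critical_value(lam, level):
    """Smallest c with P(X >= c) <= level."""
    c = 0
    while 1 - pois_cdf(c - 1, lam) > level:
        c += 1
    return c

lam0 = 7.5
lower = lower_critical_value(lam0, 0.05)
upper = upper_critical_value(lam0, 0.05)
print(f"Lower one-tail test: reject H0 if X <= {lower}, size {pois_cdf(lower, lam0):.3f}")
print(f"Upper one-tail test: reject H0 if X >= {upper}, size {1 - pois_cdf(upper - 1, lam0):.3f}")
```

Calling the same two functions with a level of 0.025 gives the two ends of a 5% two-tail test in which neither tail exceeds 2.5%.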




Consider the B(15, 0.5) distribution. 
The table shows both the individual probabilities and the cumulative 
probabilities, and the graph shows the shape of the distribution. 


 x | P(X = x) | P(X ≤ x)
 0 | 0.00003  | 0.00003
 1 | 0.00046  | 0.00049
 2 | 0.00320  | 0.00369
 3 | 0.01389  | 0.01758
 4 | 0.04166  | 0.05923
 5 | 0.09164  | 0.15088
 6 | 0.15274  | 0.30362
 7 | 0.19638  | 0.50000
 8 | 0.19638  | 0.69638
 9 | 0.15274  | 0.84912
10 | 0.09164  | 0.94077
11 | 0.04166  | 0.98242
12 | 0.01389  | 0.99631
13 | 0.00320  | 0.99951
14 | 0.00046  | 0.99997
15 | 0.00003  | 1.00000

[Figure: Binomial n = 15, p = 0.5; horizontal axis: x from 0 to 15]


While the individual probabilities are helpful in understanding the shape 
of the distribution and getting a sense of where the extreme values will be, 
it is the table of cumulative values which gives you the critical region easily. 


For a 5% two-tail test you want to have no more than 2.5% at each end. The
values 0-3 have a total probability of 1.76% and 0-4 have probability 5.92%,
so the critical region would include 0-3 at the bottom.


At the top end, you are comparing the cumulative probabilities with a
target of 0.975 to cut off 2.5%, so the rest of the critical region will be
12-15, and you need to be very careful here.


The cumulative probability for x = 11 (0.98242) is the closest to the target
of 0.975, but it is P(X ≤ 11), so it gives P(X ≥ 12) = 1 - 0.98242 = 0.01758.
The first value in the critical region at the top end is therefore 12, not 11,
since P(X ≥ 11) = 1 - 0.94077 = 0.05923 > 0.025.


A 5% one-tail test for a decrease in p would have critical region containing 
just 0-3 (with an exact significance level of 1.8%) because 0-4 would be 
greater than 5%. This shows one of the inescapable difficulties with testing 
discrete distributions with small numbers of likely values - you may not be 
able to get close to the desired significance level. 
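
The two-tail search described above can be sketched in Python as follows. This is a hedged illustration under the convention that neither tail may exceed half the stated level; the function name two_tail_region and its return format are assumptions introduced here.

```python
from math import comb

def binom_pmf(k, n, p):
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tail_region(n, p, level):
    """Return (low, high, size): reject H0 if X <= low or X >= high.

    Each tail is built up one value at a time and stops before its
    probability would exceed level / 2. low = -1 means the lower tail
    is empty; high = n + 1 means the upper tail is empty.
    """
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    half = level / 2
    low, bottom = -1, 0.0
    while bottom + pmf[low + 1] <= half:
        low += 1
        bottom += pmf[low]
    high, top = n + 1, 0.0
    while top + pmf[high - 1] <= half:
        high -= 1
        top += pmf[high]
    return low, high, bottom + top

low, high, size = two_tail_region(15, 0.5, 0.05)
print(f"Reject H0 if X <= {low} or X >= {high}; actual size = {size:.4f}")
```

For a skewed distribution such as B(15, 0.2) the same function returns low = -1, because even X = 0 alone has probability above 2.5%.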
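This table and graph show the B(15, 0.2) distribution.

 x | P(X = x) | P(X ≤ x)
 0 | 0.03518  | 0.03518
 1 | 0.13194  | 0.16713
 2 | 0.23090  | 0.39802
 3 | 0.25014  | 0.64816
 4 | 0.18760  | 0.83577
 5 | 0.10318  | 0.93895
 6 | 0.04299  | 0.98194
 7 | 0.01382  | 0.99576
 8 | 0.00345  | 0.99922
 9 | 0.00067  | 0.99989
10 | 0.00010  | 0.99999
11 | 0.00001  | 1.00000
12 | 0.00000  | 1.00000
13 | 0.00000  | 1.00000
14 | 0.00000  | 1.00000
15 | 0.00000  | 1.00000

[Figure: Binomial n = 15, p = 0.2]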
The only sensible 5% two-tail test would have a critical region of 0 at the
bottom (with probability of 3.5%) and > 6 (i.e. 7-15) at the top (with
probability 1 - 0.98194 = 1.8%). It would not make sense to be unable to
reject any low values at all in a two-tail test, yet the only option
gives too high a probability of a Type I error (3.5% + 1.8% = 5.3%). In this
case you would not be able to construct a test (and you would not be asked
to in the context of this course).


Remember that for binomial distributions with large n and Poisson
distributions with large values of λ you can approximate the exact
distribution with a normal distribution, which will allow tests to be
carried out without burdensome calculations. You will cover this in Chapter 9.


Alternative method of carrying out a test 
(probability approach) 


You need to be familiar with critical (rejection) and acceptance regions,
but there is a second approach possible in carrying out a hypothesis test.
That is to work out the probability of seeing a result ‘at least as unusual
(far from the null hypothesis parameter value)’ as the one observed.

Worked examples in later sections of this chapter will show both approaches.




Exercise 8.2 


1. For a binomial with n = 10, testing H₀: p = 0.5 against H₁: p ≠ 0.5 with
a 5% level of significance.

2. For a binomial with n = 10, testing H₀: p = 0.5 against H₁: p < 0.5
with a 5% level of significance.

3. For a binomial with n = 10, testing H₀: p = 0.5 against H₁: p ≠ 0.5 with
a 2% level of significance.

4. For a binomial with n = 12, testing H₀: p = 0.3 against H₁: p ≠ 0.3 with
a 5% level of significance.

5. For a binomial with n = 12, testing H₀: p = 0.3 against H₁: p < 0.3
with a 5% level of significance.

6. For a binomial with n = 15, testing H₀: p = 0.8 against H₁: p ≠ 0.8
with a 5% level of significance.

7. For a binomial with n = 15, testing H₀: p = 0.8 against H₁: p > 0.8
with a 5% level of significance.

8. For a Poisson, testing H₀: λ = 2 against H₁: λ < 2 with a 5% level of significance.

9. For a Poisson, testing H₀: λ = 2 against H₁: λ ≠ 2 with a 5% level of significance.

10. For a Poisson, testing H₀: λ = 8.5 against H₁: λ < 8.5 with a 5% level of significance.

11. For a Poisson, testing H₀: λ = 8.5 against H₁: λ > 8.5 with a 5% level of significance.

12. For a Poisson, testing H₀: λ = 8.5 against H₁: λ ≠ 8.5 with a 10% level of significance.


8.3 Type I and Type II errors 


In conducting a hypothesis test you end up deciding either to accept the 
null hypothesis or to reject it. In either case it is possible to be incorrect — 
giving rise to the two types of error identified at the end of Section 8.1. 


Type I error:
The null hypothesis is actually true and you reject it.
The probability of this happening is the significance level of the test.


Type II error:
You accept the null hypothesis and it was not true.
The likelihood of this happening depends on what the true value of the
parameter is.

5% tests are the standard default for the significance level, because they usually
give the best balance between the two possible errors. However, there are
occasions where reducing the size of Type I or Type II errors is important.


You will be given a particular level of significance in any case where you 
are expected to use something other than 5%. 


Some diagrams will help to illuminate what happens with Type II errors.
Remember that a test produces a critical region in which the null hypothesis
is rejected, and all other outcomes are in the acceptance region. If the null
hypothesis is not true, then p in the binomial or λ in the Poisson takes a
value other than that specified in the null hypothesis, and P(Type II error)
will be the probability of an observation falling in the acceptance region.

Consider the two-tail test in the last section with λ = 7.5. The purple bars
represent outcomes in the critical (rejection) region and the green bars
represent outcomes in the acceptance region.

[Figure: Poisson, mean = 7.5]

And consider what the distribution looks like with λ = 5, keeping the colours
for acceptance and critical regions the same:

[Figure: Poisson, mean = 5]

Even though 5 feels a long way off the null value of 7.5, you can see how
much of the distribution still falls within the acceptance region, leading
to an incorrect conclusion. P(Type II error) ≈ 0.87.


In the last section you saw that a one-tail test for the binomial with n = 15,
with H₀: p = 0.5 vs H₁: p < 0.5, has critical region 0-3, giving the size of the
test as 1.8% (which is the probability of a Type I error). The distribution
under H₀ showing the critical region in purple is shown below.

[Figure: Binomial n = 15, p = 0.5]

If you now consider a specific case of p = 0.2 (so only one success in five
trials rather than one in two) you might feel that is a pretty big shift and
that the test should identify it. In this case P(Type II error) is 35.2%.

[Figure: Binomial n = 15, p = 0.2]

Moreover, as you move closer to the null hypothesis value, the
probability of a Type II error gets larger very quickly.
With p = 0.3, P(Type II error) is over 70%.

[Figure: Binomial n = 15, p = 0.3]
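
The way P(Type II error) grows as the true value of p approaches the null value can be checked numerically. The sketch below is illustrative only; the helper binom_cdf and the particular true values of p tried are assumptions chosen here.

```python
from math import comb

def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, critical_max = 15, 3      # reject H0: p = 0.5 if X <= 3

# P(Type I error): landing in the critical region when H0 is true
size = binom_cdf(critical_max, n, 0.5)
print(f"P(Type I error) = {size:.3f}")

# P(Type II error): landing in the acceptance region X >= 4 for a given true p
type_ii = {p: 1 - binom_cdf(critical_max, n, p) for p in (0.1, 0.2, 0.3, 0.4)}
for p, prob in type_ii.items():
    print(f"true p = {p}: P(Type II error) = {prob:.3f}")
```

The Type I probability is fixed by the null hypothesis alone, whereas the Type II probability has to be recalculated for each true value of p you want to consider.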

Example 3

A test is being carried out on a binomial distribution with n = 15, for H₀: p = 0.3,
with a critical region of 0 and 1.

i) Calculate the significance level of the test.
ii) State the probability that a Type I error occurs.
iii) If in fact p = 0.1, calculate the probability of a Type II error.


i) If X ~ B(15, 0.3) with critical region 0 and 1 then the significance level is
P(X = 0, 1) = 0.7¹⁵ + 15 × 0.7¹⁴ × 0.3 = 0.00475 + 0.03052 = 0.035.

ii) The probability that a Type I error occurs = the significance level
(by definition) so it is 3.5%.

iii) If X ~ B(15, 0.1) then the probability of a Type II error is the probability
X is observed in the acceptance region, i.e. P(X > 1):

P(X > 1) = 1 - (0.9¹⁵ + 15 × 0.9¹⁴ × 0.1)
= 1 - (0.20589 + 0.34315) = 0.451 or 45.1%


Example 4 
A test is being carried out on a Poisson distribution H₀: λ = 4.5 against
H₁: λ < 4.5 with a critical region of 0.
a) Calculate the significance level of the test.
b) When the test is carried out, the value observed is 3.
i) State the conclusion of the test.
ii) State which type of error in the hypothesis test (Type I or Type II)
could have been made in these circumstances.


a) If X ~ Po(4.5) then P(X = 0) = e^(-4.5) = 0.011, so the significance
level is 1.1% (the probability of being in the critical (rejection) region
when H₀ is true).

b) i) Since 3 does not lie in the critical region, the decision is to accept
H₀ and conclude that there is insufficient evidence to suggest that
the mean is less than 4.5.

ii) Since the decision is to accept H₀ the only error possible is the
Type II, where you incorrectly accept the null hypothesis.




Exercise 8.3 


1. A test is being carried out on a binomial distribution with n = 15, for
H₀: p = 0.2, with a critical region of 0 and 1.
i) Calculate the significance level of the test.
ii) State the probability that a Type I error occurs.
iii) If in fact p = 0.1, calculate the probability of a Type II error.

2. A test is being carried out on a Poisson distribution H₀: λ = 6 against
H₁: λ < 6. A 2% test is to be carried out.
i) Find the critical region for the test.
ii) Calculate the actual significance level of the test.
iii) State the probability that a Type I error occurs.
iv) If in fact λ = 3, calculate the probability of a Type II error.

3. A test is being carried out on a binomial distribution with n = 20, for
H₀: p = 0.3, with a critical region of 0, 1 and 2.
i) Calculate the significance level of the test.
ii) State the probability that a Type I error occurs.
iii) If in fact p = 0.1, calculate the probability of a Type II error.

4. A test is being carried out on a Poisson distribution H₀: λ = 0.8 against
H₁: λ > 0.8. A 5% test is to be carried out.
a) i) Find the critical region for the test.
ii) Calculate the probability that a Type I error occurs.
b) When the test is carried out, the value observed is 3.
i) State the conclusion of the test.
ii) State which type of error in the hypothesis test (Type I or Type II)
could have been made in these circumstances.

5. A test is being carried out on a Poisson distribution H₀: λ = 3 against H₁: λ < 3.
i) Show that the critical region for a 5% test contains only 0.
ii) Show that the probability of a Type II error for λ = 0.3 is greater than 25%.

6. A test is being carried out on a binomial distribution with n = 20,
for H₀: p = 0.6, against H₁: p > 0.6.
a) i) Show that the critical region for a 5% test is X ≥ 17.
ii) State the probability that a Type I error occurs.
b) If the proportion observed in the test was 75%, state which type of
error in the hypothesis test (Type I or Type II) could have been made.


8.4 Hypothesis test for the proportion p of
a binomial distribution


Sections 8.4 and 8.5 look at doing the full hypothesis test procedure for the
binomial and the Poisson distributions. This means identifying formally the
null and alternative hypotheses, constructing the test as in Section 8.2, and
then applying the decision rule and stating the conclusion. In some cases
you will also be asked about errors.


The conclusions will be in one of these forms:

• Reject H₀, and conclude that there is significant evidence
  (at the ...% level) that p is greater than ...
  (or whatever the appropriate context is).
• Accept H₀, and conclude that there is not significant evidence
  that p is greater than ...
  This is a lack of evidence against the hypothesis, not evidence in
  support of it, and your language must reflect this.


Example 5 


30% of customers in a large store present the store’s loyalty 
card when they buy something in the store. 

The store runs an advertising campaign to promote their
loyalty card. After the advertising campaign has finished,
a random sample of 20 sales is examined and a loyalty card
was used in 10 of them.

Test at the 5% level of significance whether the campaign has 
been effective in increasing the use of the loyalty card. 


Let X = the number of sales where a loyalty card is used (in the random sample).
Then X ~ B(20, p), where p is the proportion using loyalty cards now.
This is a one-tail test since it is looking for p to be increased from 0.3.

The formal statement of the hypotheses is

H₀: p = 0.3    H₁: p > 0.3

Critical region method:

The closest you can get to a 5% test is a 4.8% test where the critical
region is X ≥ 10.

There were actually 10 cards used, so the decision is to reject H₀.
There is significant evidence (at the 5% level) to suggest that the
advertising campaign has increased the use of the store loyalty card.


Probability method:
P(X ≥ 10) = 4.8%. This is < 5%, so reject H₀ (and state the conclusion as before).

The value of 4.8% is known as the p-value of the observations: the
probability of observing an outcome at least as extreme as what has been
seen, if H₀ was true.

Most academic research results are reported quoting p-values, but they
are not included in this course.
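
The probability method for a one-tail binomial test reduces to a single tail sum. The following Python sketch uses the figures of Example 5; the function name binom_upper_tail is an assumption introduced for illustration.

```python
from math import comb

def binom_upper_tail(x, n, p):
    """Return P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x, n + 1))

n, p0, observed, level = 20, 0.3, 10, 0.05   # H0: p = 0.3, H1: p > 0.3

# Probability of a result at least as extreme as the one observed, if H0 is true
p_value = binom_upper_tail(observed, n, p0)
print(f"P(X >= {observed}) = {p_value:.3f}")
print("Reject H0" if p_value < level else "Accept H0")
```

Note that the sum starts at the observed value itself, because the observed outcome counts as one of the outcomes ‘at least as extreme’.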


Example 6 


A psychologist moves to New Zealand after working in the UK. 

In the UK 25% of her patients suffered from a fear of heights. 

In a random sample of 20 of her patients in New Zealand she finds that there
are 7 who suffer from a fear of heights.

Test at the 5% level of significance whether there is a difference in proportions 
of people who suffer from a fear of heights in the UK and in New Zealand. 


X ~ B(20, p) where p is the proportion of patients in New Zealand with
a fear of heights.

The psychologist has no reason to suppose that the proportion
should be higher or lower in New Zealand than in the UK.

This should be a two-tail test looking for a difference.

The formal statement of the hypotheses is

H₀: p = 0.25    H₁: p ≠ 0.25

Probability method:

P(X ≥ 7) = 1 - P(X ≤ 6) = 21.4% > 2.5%, so accept H₀ and conclude that
there is not sufficient evidence at the 5% level to suggest the proportion in
New Zealand is different from in the UK.

Critical region method:

P(X ≤ 1) = 2.4%; P(X ≥ 10) = 1.4%, so the decision is to reject H₀ if X is 0, 1 or
at least 10.

The observed outcome was 7, which does not lie in the critical region,
so accept H₀ (and state the same conclusion).


Note that you can be asked explicitly to construct a critical region, so you 
must be able to conduct tests using that method. In cases like Example 6 
where you are conducting a two-tail test, there is less work involved in 
using the probability method if you are only asked to carry out the test for 
a discrete distribution — because you only need to investigate one tail of the 
distribution — but you must be careful to show that you are comparing the 
probability with half the size of the test. 
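
The one-tail shortcut for a two-tail test can be sketched in Python as below, using the figures of Example 6. This is an illustration under the stated convention of comparing one tail with half the level; the function name two_tail_probability_test is an assumption made here.

```python
from math import comb

def binom_pmf(k, n, p):
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tail_probability_test(observed, n, p0, level):
    """Return (tail_probability, reject) for a two-tail binomial test.

    Only the tail on the same side of the mean n * p0 as the observation
    is calculated, and it is compared with half the stated level.
    """
    if observed >= n * p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(observed + 1))
    return tail, tail <= level / 2

tail, reject = two_tail_probability_test(7, 20, 0.25, 0.05)
print(f"Tail probability = {tail:.3f}; compare with 0.025")
print("Reject H0" if reject else "Accept H0")
```

An observation of 10 or of 1 in the same test would fall in the critical region found by the critical region method, and the function reports a rejection in both cases.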


Exercise 8.4


1. In a multiple choice examination paper in Psychology, a candidate has
to select which one of five possible answers to a question is correct.
On a paper of 15 questions she gets 6 correct answers.
Her teacher says she can’t have done any work at all. Test, at the 5%
significance level, whether there is evidence to suggest that she has done
better than just guessing.

2. A coin is tossed 12 times, and 10 heads are seen.
i) Test at the 5% significance level whether the coin is biased.
ii) State which type of error in the hypothesis test (Type I or Type II)
could have been made in these circumstances.

3. An airline claims that at least 96% of their flights arrive within 5 minutes
of their published arrival time. In a random sample of 30 flights three arrive
more than 5 minutes after the published arrival time. Test the airline's claim
at the 10% level of significance.

4. A train company claims that at least 90% of their trains arrive on time.
In a random sample of 35 train arrivals, three arrive late. Explain why
you would not need to do any detailed calculations in order to conclude
this sample would not provide evidence to suggest that the company
overstates their performance.

5. A financial advisor’s publicity claims that at least 90% of their customers
are satisfied with their performance.
i) Is a 5% or a 10% test more likely to conclude that the financial advisor
has overstated their performance?
In a random sample of 20 customers 4 said they were not satisfied.
ii) Carry out a 5% test to investigate whether the financial advisor has
overstated their performance.
iii) Could a Type II error have been made in this situation?

6. In a hotel 25% of people take longer than 10 minutes to register on arrival.
The management install a new computer system which they claim will
reduce the time to register. It is decided to accept the claim if, in a random
sample of 24 people, the number taking longer than 10 minutes to register
is not more than 2.
i) Calculate the significance level of the test.
ii) State the probability that a Type I error occurs.
iii) Calculate the probability that a Type II error occurs if the probability
of a person taking longer than 10 minutes to register is now 10%.


8.5 Hypothesis test for the mean of a Poisson
distribution


You use the same process for testing the Poisson distribution:

• identify the hypotheses formally and define the critical region; carry out
  the test; state conclusions
• OR use the probability approach.


Example 7 

A single observation x is to be taken from a Poisson distribution with
parameter λ.

This observation is to be used to test H₀: λ = 6 against H₁: λ ≠ 6.

i) Using a 5% significance level, find the critical region for this test
assuming that the probability of rejection in either tail is to be no
more than 2.5%.

ii) Write down the actual significance level of this test.

The actual value of x obtained was 2.

iii) State a conclusion that can be drawn based on this value.


i) P(X ≤ 1) = 1.7% at the bottom end and P(X ≥ 12) = 2.0% at the
top end, so the decision rule is: Reject H₀ if X ≤ 1 or X ≥ 12.

ii) The size of the test is 1.7% + 2.0% = 3.7%.

iii) Since 2 is not in the critical region, you should accept H₀ and
conclude that there is not sufficient evidence at the 5% level to
suggest that the mean is not 6.


Example 8 

The number of accidents on a stretch of road may be modelled by a Poisson 

distribution with an average rate of 2.5 accidents per week. 

a) Find the probability that there are at least 5 accidents in a randomly 

chosen week. 

Some new warning notices are installed and in a 3-week period there 

are 4 accidents. 

b) Using a 5% significance level, test whether there has been a reduction 
in the average rate of occurrence of accidents. 

a) If X ~ Po(2.5), P(X ≥ 5) = 1 - P(X ≤ 4) = 1 - 0.8912 = 0.1088

b) Test H₀: λ = 7.5 against H₁: λ < 7.5
(for the 3-week period the mean is 3 × 2.5 = 7.5 under H₀).


Probability approach:

P(X ≤ 4) = 0.132 > 5%, so accept H₀ and conclude there is not sufficient
evidence to suggest that there has been a reduction in the average rate
of accidents occurring.


Critical region:

P(X = 0) = 0.0006; P(X ≤ 1) = 0.0047; P(X ≤ 2) = 0.0203; P(X ≤ 3) = 0.059 > 5%
So the critical region contains 0, 1 and 2. Since 4 does not lie in this region H₀
would be accepted as before.
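
The key step in Example 8 is rescaling the weekly rate to the length of the observation period before any probability is calculated. A minimal Python sketch of that step follows; the helper pois_cdf and the variable names are assumptions introduced here.

```python
from math import exp, factorial

def pois_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

rate_per_week, weeks, observed, level = 2.5, 3, 4, 0.05

# Under H0 the count for the whole period is Poisson with mean rate x time
lam0 = rate_per_week * weeks

# Lower-tail probability of a count at least as low as the one observed
tail = pois_cdf(observed, lam0)
print(f"Under H0 the {weeks}-week count is Po({lam0}); P(X <= {observed}) = {tail:.3f}")
print("Reject H0" if tail <= level else "Accept H0")
```

Comparing the observed 3-week total with the weekly mean of 2.5 instead of 7.5 would be testing the wrong distribution.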


Example 9 


Luigi was recently appointed to be responsible for the service in a restaurant. 
During the previous year, the restaurant received an average of 3 emails per week 
complaining about the quality of service in the restaurant. 

The number of such emails may be modelled by a Poisson distribution. 

a) During the week before Luigi’s appointment, 6 such emails were received. 
Examine, at the 5% level of significance, whether there is significant 
evidence that, immediately before Luigi’s appointment, the mean number 
of such emails received exceeded 3 per week. 

b) On his appointment, Luigi introduced changes to the methods of waiters 
recording orders and passing them to the kitchen. Following these changes, 
2 emails of complaint were received during a two-week period. 

Examine, using a 5% level of significance, whether there is significant 
evidence that, following the changes introduced by Luigi, the mean 
number of such emails received was less than 3 per week. 

c) Comment on the effectiveness of the changes introduced by Luigi. 

a) If X is the number of emails received in a week then X ~ Po(λ), and the
hypotheses are H₀: λ = 3; H₁: λ > 3.

P(X ≥ 6) = 1 - P(X ≤ 5) = 1 - 0.916 = 0.084 > 5%, so accept H₀ and
conclude there is not sufficient evidence to suggest that the mean is
more than 3 per week.

b) Now the hypotheses are H₀: λ = 3; H₁: λ < 3. If Y is the number of emails
received in two weeks then, under H₀, Y ~ Po(6). P(Y ≤ 2) = 0.062 > 5%, so
accept H₀ and conclude there is not sufficient evidence to suggest that the
mean is less than 3 per week.

c) The number of complaints has gone from 6 in a week to only 2 in a
two-week period, which looks as though the changes might have had an
effect. However, the tests on the mean being higher than 3 per week
before and being lower than 3 per week after did not provide sufficient
evidence in either case, highlighting that accepting the null hypothesis in
a test is a fairly weak statement. The problem is that testing in only one
or two weeks makes it very difficult to identify a shift in the mean rate. If
Luigi monitored the complaints over a period of two months he would
have a much better evidence base on which to judge the effects of the
changes.
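
The benefit of a longer monitoring period can be illustrated by comparing Type II error probabilities for lower-tail 5% tests based on different numbers of weeks. The sketch below is hypothetical: the true rate of 1.5 emails per week is an assumed value chosen only for illustration, and the function names are introduced here.

```python
from math import exp, factorial

def pois_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam); an empty sum (k < 0) gives 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def type_ii_probability(weeks, null_rate=3.0, true_rate=1.5, level=0.05):
    """P(accept H0) for a lower-tail test on the total count over `weeks` weeks.

    true_rate = 1.5 is a hypothetical value, not taken from the example.
    """
    lam0 = null_rate * weeks
    c = -1                                   # critical region is X <= c
    while pois_cdf(c + 1, lam0) <= level:
        c += 1
    # Acceptance region is X > c, evaluated under the assumed true rate
    return 1 - pois_cdf(c, true_rate * weeks)

for weeks in (1, 2, 8):
    print(f"{weeks} week(s): P(Type II error) = {type_ii_probability(weeks):.2f}")
```

Under these assumptions the chance of missing a genuine halving of the rate is far smaller for an eight-week total than for a single week.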


Exercise 8.5 


1. A single observation x is to be taken from a Poisson distribution with
parameter λ.
This observation is to be used to test H₀: λ = 9.5 against H₁: λ < 9.5.
i) Using a 5% significance level, find the critical region for this test.
ii) Write down the significance level of this test.
The actual value of x obtained was 4.
iii) State a conclusion that can be drawn based on this value.

2. ‘Hits’ on a webpage at a particular location on the Internet occur at an
average rate of one every 3 minutes, between the times 11 am and 4 pm.
i) If X is the number of hits observed in a half-hour period between the
times 11 am and 4 pm, which distribution is appropriate to model
this situation and why?
ii) Calculate the probability that there are no more than 4 hits in this
half hour.
iii) Why might the probability that there are no more than 4 hits in the
half hour 3 am to 3.30 am be different from the answer in ii)?
iv) If there were 4 hits between 3 am and 3.30 am, test at the 5% level
of significance that there is a difference in the average rate of
occurrences between the periods 11 am to 4 pm and 3 am to 3.30 am.
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3. Accidents occur at a certain road junction at the average of 3 per month. 

a) Suggest a suitable model for the number of accidents at the junction 
in the next month. 

b) Find the probability of 5 or more accidents at this road junction in 
the next month. 

The local residents have applied for a crossing to be installed. The 

planning committee agree to monitor the situation for the next 12 

months. If there is at least one month with 5 or more accidents in it they 

will install a crossing. 

c) Find the probability that a crossing is installed. 

d) Ifacrossing is installed, and the local residents want to see if there 
is any improvement in the situation, give the decision rules which 
would apply if the test was to look at the number of accidents in 
i) a1-month period ii) a 3-month period. 

e) Give one advantage of choosing the 
i) 1-month test period ii) 3-month test period. 

4. A single observation x is to be taken from a Poisson distribution with 

parameter A. 

‘This observation is to be used to test H, : A = 2.5 against H, : A > 2.5. 

i) Using a 5% significance level, find the critical region for this test 

ii) Write down the significance level of this test. 

The actual value of x obtained was 5. 

iii) State a conclusion that can be drawn based on this value. 

5. Customers arrive at a store at an average rate of one every 4 minutes, 
between the times 9 am and 4 pm. 

i) If X is the number of customers in a 10-minute period between the
times 9 am and 4 pm, which distribution is appropriate to model this
situation and why?

ii) Calculate the probability that there are no more than 4 customers in this
10-minute period.

iii) Why might the probability that there are no more than 4 customers 
in the 10 minutes just before the store closes be different from the 
answer in ii)? 

iv) If 5 customers arrive in the 10 minutes just before the store closes,
test at the 5% level of significance that there is a difference in the 
average rate of occurrences between the period 9 am to 4 pm and just 
before it closes. 
6. Accidents occur on a certain section of a train route at an average of 2.2
per month.

a) Suggest a suitable model for the number of accidents on this section
in the next month.

b) Find the probability of 5 or more accidents on this section in the next
month.

The local residents have applied for a speed limit to be imposed on
this route. The company agree to monitor the situation for the next 12
months. If there is at least one month with 5 or more accidents in it they
will impose a speed limit on the section.

c) Find the probability that a speed limit is imposed.

d) If a speed limit is imposed, and the local residents want to see if there
is any improvement in the situation,
i) explain why it is not possible to have a test at the 5% level based
on one month
ii) give the decision rule which would apply if the (5%) test was to
look at the total number of accidents in a 3-month period.

e) Give one other advantage of choosing the 3-month test period.


Summary exercise 8 


For the following hypothesis tests, state 
whether you would use a one- or two-tail 
test, and find the critical region and the exact 
significance level it will give. 

i) For a binomial with n = 10, testing
$H_0: p = 0.27$ against $H_1: p < 0.27$ with a
5% level of significance.

ii) For a Poisson, testing $H_0: \lambda = 6.5$
against $H_1: \lambda \neq 6.5$ with a 10% level of
significance.

A test is being carried out on a Poisson
distribution, $H_0: \lambda = 4.5$ against $H_1: \lambda < 4.5$.
A 2% test is to be carried out.

i) Find the critical region for the test.

ii) Calculate the actual significance level of
the test.

iii) State the probability that a Type I error
occurs.

iv) If in fact $\lambda = 2$, calculate the probability
of a Type II error.

A city council claims that at least 95% of
their residents are satisfied with the services
the council provide. In a random sample of
20 residents, 4 said they were not satisfied.

i) Carry out a 5% test to investigate whether 
the council has overstated the level of 
satisfaction with the services provided. 

ii) State what type of error could have been 
made in this test. 


When a council published plans to redevelop 
an area in the centre of the town only 10% 
of local residents approved the plans. The 
council then modified the plans and, in
a random sample of 20 local residents, 5
approved the modified plans.

i) Show that a test at the 5% level of
significance will accept that the 
proportion of local residents who 
approve of the modified plans is greater 
than for the original plans. 

ii) Do you think it would be a good idea 
for the council to go ahead with the 
modified plans on the basis of this 
increased approval rate? 


5. It is claimed that a certain 6-sided die is
biased so that it shows a one less often than
if it was fair. In order to test this claim at the
5% level of significance the die is thrown 20
times and the number of ones is noted.

i) Given that the die shows a one on 1 of
the 20 throws, carry out the test.

On another occasion the same test is carried
out again.

ii) Find the probability of a Type I error.

iii) Explain what is meant by a Type II error
in this context.


The number of hurricanes hitting a certain
island over the past 150 years has followed a
Poisson distribution with mean 1.2 per year.
Insurance companies suspect that global
warming has now increased the mean and
they should increase their storm insurance
premiums. A hypothesis test, at the 5% level
of significance, is to be carried out to test this
suspicion. The number of hurricanes which
hit the island in the next year will be used for
the test.

i) Find the rejection region for the test.

ii) Find the probability of making a Type II
error if the mean number of hurricanes
hitting the island has actually increased
to 2.5 per year.


7. The number of accidents at a particular bend
on a rural road used to follow a Poisson
distribution with accidents occurring on
average at a rate of one every 3 days. Local
authorities put up new warning signs 25 days
ago and want to check whether the rate of
accidents has decreased since they did this.
They ask their statistician to carry out a 5% test.

a) State the hypotheses and find the critical
region for the test.

b) State what is meant by a Type I error,
and find the probability that the test
results in a Type I error.

c) There were 4 accidents recorded in the
25 days. Carry out the test.

d) If the rate of accidents occurring had
reduced to one every 5 days, find the
probability of a Type II error.


A certain blood disorder affects 1 in 5000
people in the USA. A researcher believes
that it may affect fewer people in his home
country. He consults the medical records of a
random sample of 22 000 people in his home
country.

a) Using a suitable approximation, state the
hypotheses and find the critical region
for a test using a 5% level of significance.

When the researcher analysed the 22 000
records he found 1 person affected by the
blood disorder.

b) Carry out the test, stating the actual level
of significance for the test.

c) Find the probability of a Type II error if
the blood disorder affects only 1 person
in 10 000 in his home country.

d) Comment on his choice of sample size in
light of your answer to part c).

A psychologist who is treating patients with
a rare mental disorder observes that having
them meditate in a quiet dark room for an
hour daily reduces their violent outbursts.
Medical literature suggests that people
suffering from this disorder have a violent
outburst on any day with a probability of
0.15, independently of other days.

The psychologist sets up a study with a
random sample of 10 patients with this
disorder from a national register and for 3
randomly assigned days each she has them
meditate in a quiet dark room for an hour.

a) She wants to test whether this activity
decreases the probability of the patient
having a violent outburst that day. State
the hypotheses and find the critical
region for the test.

One violent outburst was observed in the
experimental period.

b) Carry out the test.

c) A colleague comments that small sample
tests are not reliable. Give a reason why
the psychologist may have opted to use a
small sample despite this.


Chapter summary 


The null hypothesis will be in the form $H_0: p = \ldots$ or $H_0: \lambda = \ldots$ and the alternative hypothesis
is in one of the forms $H_1: p > \ldots$ or $p < \ldots$ or $p \neq \ldots$ (or similar for the Poisson).

If the alternative hypothesis is $\lambda$ or $p \neq \ldots$ then the test is two-tail. The other forms give a
one-tail test.

The significance level of the test is the probability that a test will reject the null hypothesis
when it is actually true. 

Type I errors occur when the null hypothesis is incorrectly rejected, so
P(Type I error) = significance level in any test.

Type II errors occur when the null hypothesis is incorrectly accepted. The probability of
a Type II error occurring depends on the particular value of the parameter: it is the
probability that the distribution with that parameter value produces an observation lying in
the acceptance region.

Conclusions for a hypothesis test need to refer to the evidence: there is evidence to suggest
the null hypothesis is false, or a lack of evidence to suggest it is false (when the null is to be
accepted). It is important that conclusions do not imply something has been proved, or
that there is positive evidence in support of the null hypothesis.

The critical region (or rejection region) for a hypothesis test is the set of values of the test
statistic for which the null hypothesis will be rejected. The critical value(s) are the boundary
value(s) of the critical region.

The acceptance region for a hypothesis test is the set of values of the test statistic for which
the null hypothesis will be accepted.

Hypothesis tests for the binomial and Poisson distributions are carried out by comparing 
the observed value with the critical region, or by calculating the likelihood of seeing an 
observation at least as extreme as what was obtained, and comparing that probability with
the significance level of the test. 
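
The idea of a critical region and its actual significance level for a discrete distribution can be sketched in a few lines of Python. This is only an illustration: the value $\lambda = 3$ and the function names are assumptions chosen for the example, not taken from any question in this chapter.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam); returns 0 when k is negative."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def upper_critical_value(lam0, alpha):
    """Smallest c with P(X >= c | lam0) <= alpha, and that tail probability."""
    c = 0
    while 1.0 - poisson_cdf(c - 1, lam0) > alpha:
        c += 1
    return c, 1.0 - poisson_cdf(c - 1, lam0)

# Illustrative one-tail test (assumed values): H0: lambda = 3 vs H1: lambda > 3 at 5%.
c, actual_level = upper_critical_value(3.0, 0.05)
print(f"Reject H0 if X >= {c}; actual significance level = {actual_level:.4f}")
```

Because the distribution is discrete, the actual significance level is usually below the nominal 5%, which is why the two are reported separately.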
Hypothesis testing using the normal distribution


The rise of technology now means that there are
often such large amounts of data that hypothesis
testing is not really helpful, because with very
large n almost every difference is significant
even if it is not large enough to be an important
difference. However, there are still many applications,
such as control systems for production lines, which
are firmly rooted in the hypothesis testing
framework.


Objectives 

After studying this chapter you should be able to: 

• Formulate hypotheses and carry out a hypothesis test concerning the population mean in
cases where the population is normally distributed with known variance or where a large
sample is used.

• Calculate the probabilities of making Type I and Type II errors in specific situations
involving tests based on a normal distribution.


Before you start 


You should know how to:
1. Calculate probabilities for the normal distribution, e.g.
If X ~ N(83.2, 10.2) find the value of x for which P(X < x) = 0.025.

Skills check:
1. If X ~ N(57, 12) find the value of x for which P(X > x) = 0.025.


9.1 Hypothesis test for the mean of a normal distribution 


In Chapter 8 you met the logical basis for hypothesis testing, where the 
sampling distribution of a statistic was used to identify what would be the 
most unlikely events to be observed if the null hypothesis was true. This set 
of outcomes then formed the critical region for the test. In Chapter 4 and 
Chapter 7 you developed the distribution of the sample mean when the 
population was a normal distribution. 


Putting these together we can set out how to test for the mean of a normal 
distribution. As before, the construction of the test has a number of important 
steps in it: 
• Identify formally the null and alternative hypotheses.
  ◦ The null hypothesis must give a distribution, i.e. in this context it is
    always in the form $H_0: \mu = \ldots$
  ◦ The alternative hypothesis is in one of the forms: $H_1: \mu > \ldots$ or
    $\mu < \ldots$ or $\mu \neq \ldots$ (the first two are the one-tail tests and the third is the
    two-tail test).
• Give an explicit statement of the decision rule to be used.
  ◦ There are two standard formats for this:
    ▸ State the critical value, and then work out the z-score for your
      observed value, i.e. calculate the value of $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$. So for a 5%
      level of significance the critical values will be as follows: for the
      two-tail test $z = \pm 1.96$ and for the one-tail tests $z = 1.645$
      or $z = -1.645$.
    ▸ Or: 'Accept $H_0$ at the 5% level of significance if $\bar{x} < \mu + 1.645 \times \dfrac{\sigma}{\sqrt{n}}$'
      if the alternative hypothesis is for $\mu > \ldots$. If the alternative hypothesis
      is $\mu \neq \ldots$ then the decision rule will be 'Accept $H_0$ at the 5% level of
      significance if $\bar{x} \in \left(\mu - 1.96 \times \dfrac{\sigma}{\sqrt{n}},\ \mu + 1.96 \times \dfrac{\sigma}{\sqrt{n}}\right)$'.
• Apply the decision rule to the value of $\bar{x}$ or z.
• Give the conclusions of the test in the context of the problem.
  ◦ The statement of the conclusions must avoid categorical statements.
    A test allows you to state either
    1 Accept $H_0$. This is not because the evidence supports it, and should
      not be reported as such. You can only conclude there is not significant
      evidence against it.
    2 Reject $H_0$. Conclude that there is evidence, significant at the 5% level
      (or whatever the level was), that the mean is not ... (or of an increase/
      decrease in the mean if it is one-tail).
Note: if you are told that the sample was discovered not to be random, then the
conclusion of your test is invalidated.
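
The steps above can be sketched as a short Python function. This is a minimal illustration rather than a prescribed method: the function name and the `tail` argument are assumptions made for the example, and the numbers in the usage line are those of the bolt data in Example 1.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """z-test for a normal mean with known sigma.

    Returns (z, critical value, reject H0?). `tail` is "two", "upper" or "lower".
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "upper":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "lower":
        crit = std_normal.inv_cdf(1 - alpha)  # z is compared with -crit
        reject = z < -crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, crit, reject

z, crit, reject = z_test_mean(19.8, 20.0, 0.4, 10)
print(f"z = {z:.3f}, critical value = ±{crit:.3f}, reject H0: {reject}")
```

The function only reports whether $H_0$ is rejected; the conclusion still has to be written in the context of the problem, as described in the final step.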


It is worth looking at a number of sampling distributions to highlight the 
importance of the size of the sample in being able to discriminate where 
there has been a shift in the mean. All of the diagrams below use the 
same scale (so the area under all the curves remains 1) so you can see 
what difference increasing the sample size has: 


[Figure: four sampling distributions, each plotted as f(x) on the same scale, for increasing sample sizes.]
One important aspect to realise is that not only does the sampling distribution
under $H_0$ shrink in width (so the acceptance region becomes much narrower),
but for any particular value of the alternative hypothesis the sampling distribution
has the same profile as these do, centred on the other value. So the probability
of the observed sample mean falling in the acceptance region (which would
be a Type II error) reduces very quickly indeed.
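
To see how quickly the Type II error probability falls as n grows, the following sketch computes it for a two-tail test at several sample sizes. The parameter values (hypothesised mean 20.0, true mean 20.2, standard deviation 0.4) and the function name are assumptions chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_two_tail(mu0, mu_true, sigma, n, alpha=0.05):
    """P(sample mean falls in the acceptance region) when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
    sampling = NormalDist(mu_true, se)  # sampling distribution under the alternative
    return sampling.cdf(upper) - sampling.cdf(lower)

for n in (5, 10, 20, 40):
    beta = type_ii_two_tail(20.0, 20.2, 0.4, n)
    print(f"n = {n:2d}: P(Type II error) = {beta:.3f}")
```

Both effects described above appear in the code: `se` shrinks the acceptance region, and the sampling distribution centred on `mu_true` narrows at the same rate.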


Example 1 

A machine has produced bolts over a long period of time, where the length in millimetres
was distributed as N(20.0, 0.16). It is believed that recently the mean length has changed.
To test this a random sample of 10 bolts is taken and the mean length is found to be 19.8 mm.
Carry out a hypothesis test at the 5% significance level to test whether the population mean
has changed, assuming the variance remains the same.


The test is $H_0: \mu = 20.0$ vs $H_1: \mu \neq 20.0$, with $\sigma = 0.4$ and $n = 10$.
Method 1: The critical values for the 5% two-tail test are $z = \pm 1.96$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{19.8 - 20.0}{0.4/\sqrt{10}} = -1.581$, which lies in the acceptance region (between $-1.96$ and $1.96$).
Method 2: Accept $H_0$ at the 5% level of significance if $\bar{x} \in \left(\mu - 1.96 \times \dfrac{\sigma}{\sqrt{n}},\ \mu + 1.96 \times \dfrac{\sigma}{\sqrt{n}}\right)$.
So accept $H_0$ if $\bar{x} \in \left(20.0 - 1.96 \times \dfrac{0.4}{\sqrt{10}},\ 20.0 + 1.96 \times \dfrac{0.4}{\sqrt{10}}\right) = (19.752, 20.248)$.
Since 19.8 does lie in this interval you accept $H_0$.
In either case the conclusion is 'Accept $H_0$; there is not sufficient evidence to suggest that
the mean has changed'.
Example 2
The amount of rainfall on a certain island over the past 30 years has followed a normal
distribution with mean 82.3 cm per year, and standard deviation 15.3. Scientists suspect that
global warming has now increased the mean. A hypothesis test, at the 5% level of significance,
is to be carried out to test this suspicion. The average rainfall on the island over the next 5 years
will be used for the test.
a) i) Find the rejection region for the test.

ii) What is the probability of making a Type I error?

iii) Find the probability of making a Type II error if the mean rainfall on the island has
actually increased to 105 cm per year.


One of the scientists believes 5 years is too long to wait to carry out this test, and proposes
to carry out the test using the rainfall in one year.

b) i) Find the rejection region for this test.

ii) Find the probability of making a Type II error if the mean rainfall on the island has
actually increased to 105 cm per year.


a) $H_0: \mu = 82.3$ vs $H_1: \mu > 82.3$, with $\sigma = 15.3$ and $n = 5$.
i) The critical value for the 5% one-tail test is $z = 1.645$ so the rejection region is
$\bar{x} > 82.3 + 1.645 \times \dfrac{15.3}{\sqrt{5}} = 93.6$.

ii) Because the normal distribution is continuous, the probability of a Type I error for
a 5% test is always 5%.

iii) If $X \sim N(105, 15.3^2)$, then $\bar{X} \sim N\left(105, \dfrac{15.3^2}{5}\right)$ and
$P(\bar{X} < 93.6) = P\left(Z < \dfrac{93.6 - 105}{15.3/\sqrt{5}} = -1.666\right) = 1 - \Phi(1.666) = 0.048$
so the probability of a Type II error would be 4.8% for this test.

b) i) The critical value for the 5% one-tail test is still $z = 1.645$ but
the standard deviation of the test statistic is now 15.3, so the
rejection region is $x > 82.3 + 1.645 \times 15.3 = 107.5$.

ii) If $X \sim N(105, 15.3^2)$, then
$P(X < 107.5) = P\left(Z < \dfrac{107.5 - 105}{15.3} = 0.163\right) = \Phi(0.163) = 0.565$
so the probability of a Type II error would be 56.5% for this.
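
The calculations in Example 2 can be checked numerically. The sketch below is one possible way to do so; the function name is an assumption, and because it does not round the critical value before computing the Type II error probability, its results can differ from the worked solution in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_test(mu0, sigma, n, alpha, mu_true):
    """Critical sample mean for H1: mu > mu0, and P(Type II error) at mu_true."""
    se = sigma / sqrt(n)
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: the sample mean falls below the critical value although mu = mu_true.
    type_ii = NormalDist(mu_true, se).cdf(critical_mean)
    return critical_mean, type_ii

for years in (5, 1):
    crit, beta = upper_tail_test(82.3, 15.3, years, 0.05, 105)
    print(f"n = {years}: reject H0 if mean > {crit:.1f}; P(Type II) = {beta:.3f}")
```

Running the test on one year of data instead of five leaves the Type I error at 5% but makes a Type II error more likely than not.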




Exercise 9.1 


1. For each of the following situations carry out the appropriate test using one 
of the methods listed in Example 1 above. 


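i) $X \sim N(\mu, 16)$. $H_0: \mu = 38.3$ vs $H_1: \mu \neq 38.3$. 5% test. n = 20. DATA: $\bar{x} = 40.6$
ii) $X \sim N(\mu, 40)$. $H_0: \mu = 38.3$ vs $H_1: \mu \neq 38.3$. 5% test. n = 20. DATA: $\bar{x} = 40.6$
iii) $X \sim N(\mu, 40)$. $H_0: \mu = 38.3$ vs $H_1: \mu \neq 38.3$. 5% test. n = 60. DATA: $\bar{x} = 40.6$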


2. For each of the following situations carry out the appropriate test using the 
other method listed in Example 1 above. 


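i) $X \sim N(\mu, 19)$. $H_0: \mu = 81.3$ vs $H_1: \mu < 81.3$. 5% test. n = 20. DATA: $\bar{x} = 79.5$
ii) $X \sim N(\mu, 34)$. $H_0: \mu = 81.3$ vs $H_1: \mu < 81.3$. 5% test. n = 20. DATA: $\bar{x} = 79.5$
iii) $X \sim N(\mu, 34)$. $H_0: \mu = 81.3$ vs $H_1: \mu < 81.3$. 5% test. n = 60. DATA: $\bar{x} = 79.5$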


3. Look at the differences between the three tests in each of questions 1 and 2.


i) How does the change in variance affect the outcome of the test when 
everything else is the same? 


ii) How does the change in sample size affect the outcome of the test when 
everything else is the same? 


4. Jars of jam are filled by a machine. It has been found that the quantity of 
jam in a jar is normally distributed and has mean 351.2 g, with standard 
deviation 4.1 g. It is believed that the settings of the mean amount on the 
machine might have been altered accidentally. A random sample of 40 jars 
is taken and the mean quantity per jar is found to be 349.9 g. Assuming that 
the standard deviation has not been altered, state suitable null and alternative 
hypotheses and carry out a test using a 5% level of significance. 


5. The systolic blood pressure of healthy young adults can be modelled by 
a normal distribution with mean 105 and standard deviation 7. It is thought 
that otherwise healthy young adults who do not eat the recommended 
amounts of fruit and vegetables may have higher blood pressure than usual. 




A random sample of 10 young adults who eat low amounts of fruit and 
vegetables but are otherwise healthy is taken and their systolic blood 
pressures are recorded here: 

119, 106, 104, 121, 116, 118, 108, 113, 108, 112 
Test at the 2% level of significance whether young adults who eat low 
amounts of fruit and vegetables have higher levels of systolic blood 
pressure than usual for healthy young adults. 


6. The usable life of batteries produced by a manufacturer is normally 
distributed with a mean of 85.3 hours and standard deviation 3.2 hours. 
The managing director wants to monitor production to ensure that the 
mean remains at 85.3 hours. His production manager suggests two 
monitoring plans: 

Plan A: test random samples of 5 batteries 
Plan B: test random samples of 30 batteries.
i) For each plan, find the acceptance region for a 5% two-tail test. 


ii) For each plan, calculate the probability of a Type II error if the mean 
usable life shifts by 2 hours. 


The managing director looks at this information and asks the
production manager to make a plan for a 5% two-tail test for which
the probability of a Type II error if the mean usable life shifts by 2 hours
is not more than 2.5%.


iii) Find the minimum sample size which will satisfy this condition. 


9.2 Hypothesis test for the mean using a large sample 


If, however, you are told the population was not normal, or are not told that
it is (so there is no reason why you should assume that it is), then the conclusion
is still valid if the sample size is large enough to invoke the Central Limit Theorem
(n > 30). You should explicitly refer to the Central Limit Theorem, not just refer
vaguely to 'large samples'. Also with large samples you may use the unbiased
estimate of the variance (based on the data) if you don't know the population variance.


When the conditions are met for the binomial or Poisson distributions to be 
approximated by the normal distribution, you may set up a hypothesis test using 
the normal distribution. In this case you do not need to use a continuity correction. 


In the Poisson you use the sample mean to estimate the parameter $\lambda$, and use it
for both the mean and variance of the approximating normal distribution.
In the binomial you use the sample proportion, $p_s$, as the estimate of the parameter p,
and use $\sqrt{\dfrac{p_s q_s}{n}}$, where $q_s = 1 - p_s$, as the standard error in constructing the critical values.
Example 3

A college thinks the average travel time for its students is about half an hour. A random sample
of 80 students attending the college is taken and their travel times recorded in minutes. The data
are summarised by $\sum x = 2624$; $\sum x^2 = 96\,283$.


i) Calculate an unbiased estimate of the variance of the travel times.


ii) Test at the 5% level of significance whether the mean travel time is half an hour.


i) $\bar{x} = \dfrac{2624}{80} = 32.8$; $s^2 = \dfrac{80}{79}\left(\dfrac{96\,283}{80} - 32.8^2\right) = 129.3$ is the unbiased estimate
of the variance.
ii) $H_0: \mu = 30$; $H_1: \mu \neq 30$.
Method 1: The critical values for the 5% two-tail test are $z = \pm 1.96$.
$z = \dfrac{32.8 - 30}{\sqrt{129.3/80}} = 2.20 > 1.96$, so reject $H_0$ and conclude that there is evidence at the 5%
level of significance to suggest that the mean travel time is not half an hour.
Method 2: Accept $H_0$ at the 5% level of significance if
$\bar{x} \in \left(30 - 1.96 \times \sqrt{\dfrac{129.3}{80}},\ 30 + 1.96 \times \sqrt{\dfrac{129.3}{80}}\right) = (27.5, 32.5)$. Since 32.8 does not lie in this interval
you reject $H_0$ (as in Method 1).
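
The arithmetic of Example 3 can be reproduced from the summary statistics alone. The following is a minimal sketch, with function names chosen for illustration, of how the unbiased variance estimate and the test statistic follow from n, $\sum x$ and $\sum x^2$.

```python
from math import sqrt

def unbiased_variance(n, sum_x, sum_x2):
    """Unbiased estimate of the population variance from summary statistics."""
    mean = sum_x / n
    return n / (n - 1) * (sum_x2 / n - mean**2)

def large_sample_z(n, sum_x, sum_x2, mu0):
    """Test statistic for H0: mu = mu0, using the estimated variance (large n)."""
    mean = sum_x / n
    s2 = unbiased_variance(n, sum_x, sum_x2)
    return (mean - mu0) / sqrt(s2 / n)

s2 = unbiased_variance(80, 2624, 96283)
z = large_sample_z(80, 2624, 96283, 30)
print(f"s^2 = {s2:.1f}, z = {z:.2f}")
```

The same two functions apply to any large-sample question that gives the data in this summarised form.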


Example 4 
Each multiple choice question in a test has 5 suggested answers, exactly one of which is correct. 
Novak feels he knows enough about the subject matter that he can do better than just guessing 
randomly because one or two suggested answers are usually obviously wrong. There are 50 
questions on the test and Novak gets 18 correct. 
i) State null and alternative hypotheses for a test of Novak's claim. 
ii) Using a normal approximation, test at the 1% level of significance whether Novak's claim 
is justified. 


i) If X is the number of questions Novak answers correctly then X ~ B(50, p) and the hypotheses
are $H_0: p = 0.2$ vs $H_1: p > 0.2$.
ii) Using the normal approximation in a 1% one-tail test (critical value 2.326) gives the test as
'Reject $H_0$ if the proportion observed is $> 0.2 + 2.326 \times \sqrt{\dfrac{0.2 \times 0.8}{50}} = 0.332$'.
Novak gets 18 out of 50, or 36% correct, so reject $H_0$ and conclude that there is evidence at
the 1% level of significance to suggest that Novak does better than he would if he was just
guessing randomly.
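
As a numerical check on Example 4, the sketch below computes the critical proportion for a one-tail test using the normal approximation to the binomial. The function name is an assumption for illustration, and the standard error uses the hypothesised value p = 0.2, as in the worked solution.

```python
from math import sqrt
from statistics import NormalDist

def upper_critical_proportion(p0, n, alpha):
    """Observed proportions above this value lead to rejecting H0: p = p0 (H1: p > p0)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return p0 + z_crit * sqrt(p0 * (1 - p0) / n)

threshold = upper_critical_proportion(0.2, 50, 0.01)
observed = 18 / 50
print(f"critical proportion = {threshold:.3f}, observed = {observed:.2f}, "
      f"reject H0: {observed > threshold}")
```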




Example 5 


The number of a certain endangered species of frog found in a wetland area can be modelled 
by a Poisson distribution with an average rate of 0.013 frogs per square metre. Conservationists 
are concerned that the species is in decline in the area because of recent construction work in 
neighbouring areas. 


They mark off a square in the wetland with sides of 100 metres, and count the number of frogs in 
the area to test if the numbers are in decline. 


i) State null and alternative hypotheses for the test. 
ii) Using a suitable approximation, find the critical region for a 5% test. 


iii) If the mean rate has dropped to 0.009 frogs per square metre calculate the probability that a
Type II error will occur.


i) For the area selected, 10 000 square metres, the expected number of frogs is 130. So the
hypotheses are $H_0: \lambda = 130$ vs $H_1: \lambda < 130$.

ii) Using the N(130, 130) to approximate the Po(130) distribution, and using a 5% one-tail test
the critical region is 'Reject $H_0$ if $x < 130 - 1.645 \times \sqrt{130} = 111.2$'.

iii) A Type II error occurs if the null hypothesis is accepted when it is not true. If the mean rate
drops to 0.009, then the null hypothesis is not true and a Type II error can occur. In this case
the distribution of the number of frogs observed in a square of side 100 metres would be
Po(90), which can be approximated by the N(90, 90) distribution. Then
P(Type II error occurs) = P(at least 112 frogs are observed)

$= P\left(Z \geq \dfrac{111.5 - 90}{\sqrt{90}} = 2.266\right) = 1 - 0.9883 = 0.0117$
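
The two calculations in Example 5 can be sketched in Python as follows. The function names are assumptions made for illustration. As in the worked solution, the critical bound is found without a continuity correction, while the Type II error probability applies one (111.5) because it is the probability of an integer count.

```python
from math import ceil, sqrt
from statistics import NormalDist

def lower_critical_bound(lam0, alpha):
    """Counts below this bound lead to rejecting H0: lambda = lam0 (H1: lambda < lam0)."""
    return lam0 - NormalDist().inv_cdf(1 - alpha) * sqrt(lam0)

def type_ii_lower_tail(lam0, lam_true, alpha):
    """P(count falls in the acceptance region) when the true mean is lam_true."""
    smallest_accepted = ceil(lower_critical_bound(lam0, alpha))
    approx = NormalDist(lam_true, sqrt(lam_true))  # normal approximation to Po(lam_true)
    return 1 - approx.cdf(smallest_accepted - 0.5)  # continuity correction

bound = lower_critical_bound(130, 0.05)
beta = type_ii_lower_tail(130, 90, 0.05)
print(f"Reject H0 if x < {bound:.1f}; P(Type II error) = {beta:.4f}")
```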
Exercise 9.2
1. The summary statistics of a number of random samples are given below. For each one,
find the unbiased estimate of the variance and carry out the test specified.
i) $H_0: \mu = 45$ vs $H_1: \mu \neq 45$. 5% test. n = 40. DATA: $\sum x = 1846$; $\sum x^2 = 86\,835$
ii) $H_0: \mu = 57.5$ vs $H_1: \mu < 57.5$. 5% test. n = 63. DATA: $\sum x = 3514$; $\sum x^2 = 199\,281$
iii) $H_0: \mu = -0.5$ vs $H_1: \mu > -0.5$. 5% test. n = 51. DATA: $\sum x = 24.3$; $\sum x^2 = 994.27$


2. Using suitable approximations, carry out the appropriate tests for each of the
situations described below:


i) X ~ B(145, p). $H_0: p = 0.4$ vs $H_1: p \neq 0.4$. 5% test. DATA: x = 52
ii) X ~ Po($\lambda$). $H_0: \lambda = 27.1$ vs $H_1: \lambda < 27.1$. 1% test. DATA: x = 19
iii) X ~ B(92, p). $H_0: p = 0.8$ vs $H_1: p > 0.8$. 2% test. DATA: x = 65


3. The spokesperson for a group of insurance companies claims that fewer than one in five
homeowners has had an increase of more than 10% in the premium for home insurance.
A survey of 85 randomly selected homeowners asks if their home insurance
premium has increased by more than 10% at the last renewal and 22 say that
it has. Carry out a test at the 5% level of significance of the claim by the spokesperson.
4. The number of accidents recorded at a dangerous corner has followed a Poisson
distribution with an average rate of 1.2 per week. A large warning sign is erected shortly 
before the corner. Over the next 6 months (26 weeks) a total of 20 accidents are recorded 
at the corner. Carry out a 1% test to investigate whether the erection of the sign appears to 
have reduced the number of accidents. 

5. A certain variety of bush grows to heights which are normally distributed with mean 74.0 cm.
A new fertiliser is introduced in the hope that this will increase the heights. The nursery
owner records the heights of a large random sample of n bushes, and calculates that $\bar{x} = 75.2$
and s = 5.3.

i) She consults a friend who is a statistician as to whether or not there is evidence that the 
heights have increased. The friend calculates the test statistic, z, has a value of 1.867 correct 
to 3 decimal places. Calculate the value of n. 

ii) Using this value of the test statistic, carry out the test at the 5% level of significance. 

6. The number of letters received by a household on a weekday follows a Poisson distribution with
mean 3.5.

a) What is the probability that
i) on a particular weekday the household receives two or more letters
ii) the total number of letters received on ten successive weekdays is thirty or more? 
(You should use a suitable approximation.) 

b) Explain briefly why a Poisson distribution is unlikely to provide an adequate model for the 
number of letters received on a weekday throughout the year. 

c) On the Monday before Christmas, 7 letters were delivered to the household. Test at the 
1% level of significance whether the mean number of letters per day is higher in the week 
before Christmas than normally. 


9.3 Using a confidence interval to carry out a hypothesis test 


Consider the process used in Method 2 for a two-tail test, where the acceptance region is identified,
and the test carried out by seeing if the observed sample mean lies within the interval centred on the
(hypothesised) population mean, $\mu$. If you construct the corresponding confidence interval (a 95% CI
corresponds to a 5% significance level test) there is a one-to-one correspondence between occasions in
which $\mu$ lies in the CI centred on $\bar{x}$ and when $\bar{x}$ lies in the acceptance region, which is centred on $\mu$.


In situations where you obtain a confidence interval, you can then use it to test any claim made. 


Example 6 


A random sample of 40 bottles of Aguafresh gave unbiased estimates of 250.3 ml and
15.8 ml$^2$ for the population mean and variance.
i) Calculate a symmetric 95% confidence interval for the population mean. 


ii) The manufacturer claims that the mean volume of Aguafresh bottles is 252 ml. State with a 
reason whether your answer to part i) supports this claim. 


i) The 95% CI is given by $\bar{x} \pm 1.96 \dfrac{s}{\sqrt{n}} = 250.3 \pm 1.96 \sqrt{\dfrac{15.8}{40}} = (249.1, 251.5)$


ii) 252 ml lies outside this CI so there is evidence to suggest that the mean is not 252 ml.
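
The link between a confidence interval and a two-tail test can be illustrated with a short sketch using the figures from Example 6. The function name is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, variance, n, level=0.95):
    """Symmetric large-sample confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(variance / n)
    return mean - half_width, mean + half_width

low, high = confidence_interval(250.3, 15.8, 40)
claimed_mean = 252
print(f"95% CI = ({low:.1f}, {high:.1f})")
# A claimed mean outside the 95% CI would be rejected by a 5% two-tail test.
print("claim consistent with the data:", low <= claimed_mean <= high)
```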


Exercise 9.3

1. For the following sets of summary statistics construct the confidence interval indicated.
Comment on the claim made in each case in the light of the confidence interval you
have calculated.

a) n = 50; $\sum t = 257$; $\sum t^2 = 2433$; 90% CI; claim $\mu = 4.5$
b) n = 117; $\sum x = 8596$; $\sum x^2 = 2\,186\,591$; 95% CI; claim $\mu = 90$
c) n = 82; $\sum v = -459$; $\sum v^2 = 5109$; 99% CI; claim $\mu = -6$

2. A random sample of 150 bottles of fruit juice gave unbiased estimates of
600.8 millilitres and 26.8 millilitres$^2$ for the population mean and variance.


i) Calculate a symmetric 95% confidence interval for the population mean. 


ii) The manufacturer claims that the mean volume of fruit juice bottles is 602 millilitres. 
State with a reason whether your answer to part i) supports this claim. 


3. A random sample of 80 jars of pickle gave unbiased estimates of 248.5 grams
and 16.8 grams$^2$ for the population mean and variance.


i) Calculate a symmetric 95% confidence interval for the population mean. 


ii) The manufacturer claims that the mean contents of the jars is 250 grams. 
State with a reason whether your answer to part i) supports this claim. 


Summary exercise 9 


Jars of mayonnaise are filled by a machine.
It has been found that the quantity of
mayonnaise in a jar is normally distributed
and has mean 452.7 g, with standard
deviation 6.1 g. It is believed that the settings 
of the mean amount on the machine might 
have been altered accidentally. 

i) The company might be concerned if the 
mean has decreased because the labels 
claim that the jars contain at least 450g. 
Give a reason why the company might be 
concerned if the mean has increased. 


A random sample of 70 jars is taken and the
mean quantity per jar is found to be 447.9 g.

ii) Assuming that the standard deviation 
has not been altered, state suitable null 
and alternative hypotheses and carry out 
a test using a 5% level of significance. 

The pulse of healthy young adults can be
modelled by a normal distribution with
mean 80 and standard deviation 9. It is
thought that trained athletes may have lower
pulse rates than usual.
A random sample of 12 trained athletes is
taken and their pulses are recorded here:

48, 52, 59, 45, 57, 52, 44, 58, 56, 44, 63, 50
Test at the 5% level of significance whether
trained athletes have lower pulse rates than
normal healthy young adults.


i) Give a reason why, in carrying out a
statistical investigation, a sample may be 
used rather than a complete population. 

The spokesperson for a group of travel
companies claims that 40% of their holidays
are cheaper this year than they were last year.
A survey of 65 randomly selected customers
asks if their holiday is cheaper this year than
last year, and 15 of them said yes.

ii) Show that a test at the 5% level of signi- 
ficance of the claim by the spokesperson 
using these data would reject the claim. 

iii) When challenged about the claim, using 
the test as evidence, the spokesperson says 
that people may be paying more for a better 
holiday this year and her claim was 
comparing like for like prices. Comment 
on whether this might be a valid defence 
of the claim made. 


The number of accidents recorded on a 
campsite during past summers had followed 
a Poisson distribution with an average rate 
of 2.3 per week. The owner employed a 
lifeguard last summer in an effort to improve 
the safety record at the campsite. Over the 3 
months the lifeguard works (13 weeks) a total 
of 18 accidents are recorded at the campsite. 
Carry out a 5% test to investigate whether 
employing the lifeguard appears to have 
reduced the number of accidents. 


A certain species of rose grows to heights 
which are normally distributed with mean 
64.0cm. A hybrid is created which is hoped 
will increase the heights. The breeder records 
the heights of a large random sample of 60 
bushes, and calculates that x̄ = 65.2 and s = 5.6.
Test at the 1% level of significance whether
the height of the hybrid is greater than that 
of the original species. 


EXAM-STYLE QUESTIONS 


6. 


The lengths, in centimetres, of pencils 
produced in a factory have mean μ and
standard deviation 0.5. The value of μ is
supposed to be 14, but a manager claims that 
one machine is producing pencils that are 
too long on average. A random sample of 

45 pencils from this machine is taken and the 
sample mean length is found to be 14.12 cm. 


a) Test at the 5% significance level whether 
the manager's claim is justified. 


b) Comment on whether you needed to 
assume that the distribution of the length 
of pencils is normal. 


In the past, the time, in minutes, for a 
particular minor medical procedure has 
been found to have mean 34.2 minutes and 
standard deviation 2.6. A new method is 
being considered in the hope that the average 
time would be shorter. A random sample of 
50 procedures using the new method is taken 
and the mean time is found to be 33.5 minutes. 
a) Carry out a test at the 5% level of
significance to see whether the mean 
time for the procedure has decreased. 
If the new method is to be adopted, nursing 
and surgical staff will require extra training. 
b) What factors should the hospital 
administrators take into account when 
deciding whether to adopt the new 
method? 


Road accidents at a particular bend in a 
road could previously be modelled by a 
Poisson distribution with an average rate of 
4 accidents in a 30-day period. New speed 
limits were introduced a year ago. Since then 
35 accidents have occurred at that bend. 


Using a suitable approximation, test at the 
5% level of significance if there has been a 
reduction in the number of accidents at that 
bend. 


A local health authority has data showing 
that 43% of adults in its area take no 
regular exercise. They set up an advertising 
campaign to promote the benefits of taking 
regular exercise. 

After a month they conduct a random 
sample of 500 adults in its area and 243 say 
that they take some regular exercise. 


a) Using a suitable approximation, test at 
the 1% level of significance if there has 
been an increase in the number of people 
taking regular exercise in that area. 

b) State what is meant by a Type I error
in this context, and state what the 
probability is of making a Type I error. 

c) Evaluate the way the test was carried
out if it is intended to measure the
effectiveness of the advertising campaign.


Chapter summary 


The logic of hypothesis tests with continuous distributions is the same as for discrete 

distributions but you will be able to construct a test with any specified level of significance. 

• The null hypothesis must give a distribution, i.e. in this context it is always in the form
$H_0: \mu = \ldots$

The alternative hypothesis is in one of the forms: $H_1: \mu > \ldots$ or $\mu < \ldots$ or $\mu \neq \ldots$ (the first two
are the one-tail tests and the third is the two-tail test).

Give an explicit statement of the decision rule to be used. 

There are two standard formats for this:

• State the critical value, and then work out the z-score for your observed value, i.e. calculate
the value of $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. So for a 5% level of significance the critical values will be as
follows: for a two-tail test $z = \pm 1.96$ and for the one-tail test $z = 1.645$ or $z = -1.645$.
• or: 'Accept $H_0$ at the 5% level of significance if $\bar{x} < \mu + 1.645 \times \dfrac{\sigma}{\sqrt{n}}$' if the alternative
hypothesis is $\mu > \ldots$ . If the alternative hypothesis is $\mu \neq \ldots$ then the test will be
'Accept $H_0$ at the 5% level of significance if $\bar{x} \in \left[\mu - 1.96 \times \dfrac{\sigma}{\sqrt{n}},\ \mu + 1.96 \times \dfrac{\sigma}{\sqrt{n}}\right]$'.



Apply the decision rule to the value of x̄ or z.

Give the conclusions of the test in the context of the problem. 
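
The steps above can be sketched in code. The following is a minimal illustration, assuming a known population standard deviation; the function name `z_test` and the numbers in the usage lines are hypothetical and are not taken from the exercises.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject H0?) for a test of a mean with known sigma."""
    # z-score of the observed sample mean under H0: mu = mu0
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal, mean 0 and variance 1
    if tail == "two":          # H1: mu != mu0
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "upper":      # H1: mu > mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "lower":      # H1: mu < mu0
        crit = -std_normal.inv_cdf(1 - alpha)
        reject = z < crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, crit, reject


# Hypothetical data: H0: mu = 50, H1: mu > 50, sigma = 4, n = 36, sample mean 51.2
z, crit, reject = z_test(51.2, 50, 4, 36, alpha=0.05, tail="upper")
print(round(z, 3), round(crit, 3), reject)
```

The conclusion still has to be written in the context of the problem; the code only applies the decision rule.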

The mean of a large sample from a population not known to be normal can be treated as
approximately normally distributed, using the Central Limit Theorem.

For large samples from binomial and Poisson distributions the normal distribution may be 
used as an approximation if the required conditions are met, and the test carried out in the 
standard process. 
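
As a rough sketch of the approximation described above, the code below tests a binomial proportion with a normal approximation and a continuity correction. The function name, the np > 5 and nq > 5 check, and the numbers used are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def binomial_upper_tail_test(x, n, p0, alpha=0.05):
    """Test H0: p = p0 against H1: p > p0 using X ~ B(n, p0) approximated by N(np0, np0q0)."""
    mean = n * p0
    variance = n * p0 * (1 - p0)
    # Commonly quoted conditions for the normal approximation (an assumption here)
    if mean <= 5 or n * (1 - p0) <= 5:
        raise ValueError("normal approximation conditions not met")
    # Continuity correction: P(X >= x) is approximated by P(Y > x - 0.5)
    z = (x - 0.5 - mean) / sqrt(variance)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical data: 40 successes in 100 trials, H0: p = 0.3
z, p_value, reject = binomial_upper_tail_test(40, 100, 0.3)
print(round(z, 3), round(p_value, 4), reject)
```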

If an α% confidence interval has been constructed for a population mean, μ, then a two-tail

(100 − α)% level significance test can be carried out by inspection, by considering whether
the hypothesised mean, μ, is captured in the α% confidence interval.
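
The link between a confidence interval and a two-tail test can be checked numerically. This is a small sketch with hypothetical numbers; the helper names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, level=0.95):
    """Symmetric confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def two_tail_reject(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tail z-test of H0: mu = mu0 at significance level alpha."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical sample: mean 51.2, sigma 4, n = 36
low, high = confidence_interval(51.2, 4, 36, level=0.95)
agreement = []
for mu0 in (50, 49.5):
    inside = low <= mu0 <= high
    rejected = two_tail_reject(51.2, mu0, 4, 36, alpha=0.05)
    agreement.append(inside == (not rejected))
    print(mu0, inside, rejected)
```

A hypothesised mean inside the 95% interval is accepted by the 5% two-tail test, and one outside it is rejected.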
Maths in real-life


A risky business 


Companies would like to reduce the risk of various parts of their business 
as much as possible if they can do so at a reasonable cost. Financial 
products called derivatives are one of the ways of doing this and a 

simple example can illustrate why such markets have developed and are 
sustainable. Fuel is one of the biggest costs in an airline’s business model, 
but the price on the open market fluctuates. Oil companies have a business 
model where the costs of extracting and refining oil into the fuel used by 
airlines are very substantial, and the price they are able to sell the fuel at is 
determined by market conditions after the company has spent that money 
in production. A forward contract where the airline commits to buying an 
amount of fuel from the oil company over the next year at a price which is 
fixed now can be beneficial to both companies in terms of their planning. 
It is likely that one or other of the companies would have been better off 
(this happens unless the price agreed is exactly what the market price 
ended up through the year) but they are both happy to have the security 
of removing the risk. The fair pricing of a contract like this involves 
mathematics and statistics as well as knowledge of what market conditions 
may influence price movements over the period of the contract. 


Large financial institutions and wealthy individuals will put together a
portfolio of investments, aiming for high returns with low volatility,
meaning that the returns are consistently high rather than very high
sometimes and low at others.




Diversification across sectors of the economy, and 
types of financial product, also helps to minimise 
the risks associated with a severe downturn in 

a particular sector. Increasingly sophisticated 
derivative products have been devised to attract 
investors — one of these is statistical arbitrage which 
illustrates the benefits of using large numbers of 
smaller investments (instead of a small number of 
large investments) to produce more stable returns. 
It exploits pricing inefficiencies between pairs of 
securities in the same sector. The principle is quite 
straightforward: 


• Identify a pair of stocks, in the same sector, with
similar company characteristics which you consider
to have a relative difference in value.

• Buy forward the one you consider relatively undervalued, and
sell forward the one you consider relatively overvalued.

• Any movement in the sector will apply to both; if your
assessment was correct the relative difference in price change is
likely to be in your favour.

• You can improve the risk / reward profile of this endeavour by
applying it to a large number of smaller investments where the
laws of large numbers kick in.

The graphs below illustrate the difference in the volatility of the 

three strategies used by A, B and C where A invests a large amount 

in each of a small number of trades, C invests a small amount in a 

large number of trades and B is in between. 


[Figure: two graphs, each titled 'Comparing investment strategies']
On average, the three strategies give the same return, but C is very
stable, with very little difference between simulations, whereas A is
extremely variable between trials.
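
The effect can be imitated with a simple simulation. The sketch below is not the model behind the graphs: it assumes each trade has an independent, normally distributed return (mean 5%, standard deviation 20%, both hypothetical) and that the capital is split equally between trades.

```python
import random
from statistics import mean, stdev


def simulate_portfolio(n_trades, n_simulations=500, seed=0):
    """Simulate the overall return when capital is split equally over n_trades trades."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_simulations):
        # Hypothetical model: each trade's return is N(0.05, 0.20^2), independent of the others
        trade_returns = [rng.gauss(0.05, 0.20) for _ in range(n_trades)]
        results.append(mean(trade_returns))
    return results


spread = {}
for label, n_trades in [("A", 4), ("B", 25), ("C", 400)]:
    returns = simulate_portfolio(n_trades)
    spread[label] = stdev(returns)
    print(label, round(mean(returns), 3), round(spread[label], 3))
```

Under these assumptions the average return is about the same for all three, while the spread shrinks roughly in proportion to one over the square root of the number of trades.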
Exam-style paper A


1. Packets of sugar have weights that are distributed with standard deviation 
3.2g. A random sample of 120 packets is taken. The mean weight of the 
sample is found to be 748 g. Calculate a 95% confidence interval for the 
population mean weight. 


2. Ariana wants to choose one child at random from her three nieces. She 
numbers the nieces 0, 1 and 2. She then tosses two coins and chooses the 
niece whose number is the number of heads showing on the two coins. 
a) Explain why this is an unsatisfactory method of choosing a niece. 


b) Describe briefly a satisfactory method of choosing a niece. 


3. People arrive randomly and independently at a supermarket checkout at 
an average rate of 4 people every 5 minutes. 
a) Find the probability that at most 2 customers will arrive during a 
randomly chosen 3-minute period. 
b) Find, in seconds, the longest time period for which the probability 
that no customers arrive is at least 5%. 


4. A survey taken last year showed that the mean number of mobile phones 
per household in a town was 2.3. This year a random sample of 50 households 
in the same town answered a questionnaire with the following results. 


Number of mobile phones | 0 | 1 | 2 | 3 | 4 | 5 | >5
Number of households | 2 | 7 | 11 | 18 | 9 | 3 | 0


a) Calculate unbiased estimates for the population mean and variance of 
the number of mobile phones per household in the town this year. 


b) Test at the 5% significance level whether the mean number of mobile 
phones per household has changed since last year. 


5. A shop sells deodorants for men and women. The demand for each type 

is independent of the other and on average the shop sells 3 per hour for 

men and 4 per hour for women. 

a) Find the probability that in a randomly chosen hour the shop sells 2 
of each type. 

b) Find the probability that in a randomly chosen hour the shop sells 
4 deodorants altogether. 

c) Given that the shop sells a total of 4 deodorants in an hour, calculate 
the conditional probability that it sold the same number of deodorants 
for men and women. 


(Total for exam-style paper A: 50 marks)


6. A continuous random variable X has probability density function given by


atx 0SxS1 
f(x) = 2 where a is a constant. 

0 otherwise 
a) Show that the value of a is *. [3] 
b) Find the median of X. [3] 
c) Find E(X). [3] 


7. Simona suspects that a random digit generator is biased so that the 
probability, p, that it will produce a zero is greater than 0.1. She tests 
the generator by producing 10 digits. If it shows a zero on 3 or more 
occasions she will conclude that it is biased. 
a) State what is meant by a Type I error in this situation and calculate 
the probability of a Type I error. [3] 
b) Assuming that the probability p is actually 0.25, calculate the 
probability of a Type II error. [3]
Simona now produces 90 digits from the random digit generator, and 
they include 26 zeroes. 


c) Calculate an approximate 95% confidence interval for p. [4] 
Exam-style paper B 50 Marks


1. Heights of a certain species of rose are known to be normally distributed 
with standard deviation 12.8 cm. A botanist wishes to obtain a 95% 
confidence interval for the population mean height of the species, with 
total width less than 5cm. Find the smallest sample size required. [4] 


2. The random variable X has the distribution Po(11.2). Find the probability 
that the mean of a random sample of 80 observations is less than 11. [5]


3. The volume of paint in cans is supposed to be 500 ml. The volumes, in ml, 
of paint in tins have a normal distribution with mean μ and standard
deviation 4.8. A supervisor thinks that cans contain less than 500 ml. 

The volumes of a random sample of 10 tins were 


497 501 495 490 505 500 496 494 500 491 


Carry out a hypothesis test at the 5% level of significance to determine if 
there is evidence that the mean volume is less than 500 ml. [5]


4. The cost of hiring a car for a day consists of a fixed charge of $15 together 
with a charge of $0.1 per kilometre. The number of kilometres driven by 
people hiring a car for the day has mean 182 and standard deviation 42. 


a) Find the mean and standard deviation of the amount people pay 
when hiring a car for the day. [3] 


b) 8 people hire cars independently. Find the mean and standard deviation 
of the mean amount paid by all 8 people. [3] 


5. A drug used to treat high blood pressure is known to produce minor but 
uncomfortable side effects in 20% of patients taking the drug. A company 
produces a new drug that has similar benefits in treating high blood 
pressure which they claim has the same side effects in only 10% of patients 
taking it. In order to test the company’s claim that their drug reduces the 
proportion suffering this side effect, a doctor decides to give the new drug 
to a random sample of 30 patients who have high blood pressure, using the 
current proportion of 20% as the null hypothesis. He will conclude that 
there is not sufficient evidence to suggest the proportion is lower if at 
least 3 of the random sample suffer the side effect. 


a) Calculate the probability of a Type I error. [3] 
b) If the company’s claim is true, calculate the probability of a Type II error. [3]


c) In fact 4 of the people suffer from the side effect. State which error, 
Type I or Type II, might be made. Explain your answer. [2] 




6. The time in minutes for Younis to solve a puzzle can be modelled by the
continuous random variable with probability density function given by

$f(t) = \begin{cases} k\sqrt{t} & 1 \le t \le 9 \\ 0 & \text{otherwise,} \end{cases}$

where k is a constant.

a) Show that $k = \frac{3}{52}$. [3]
b) Find the mean time taken by Younis to solve a puzzle. [4]
c) Show that the median time taken by Younis to solve a puzzle
is $14^{2/3}$ (= 5.81) minutes. [2]
d) Find the probability that the time taken by Younis to solve a puzzle is
between the mean and median time. [2]


7. A bottling company produces two sizes of bottles of water. The small 
bottles contain a volume which is normally distributed with mean 501 ml 
and standard deviation 3.6 ml. The large bottles contain a volume which 
is normally distributed with mean 751 ml and standard deviation 4.8 ml. 


a) Find the probability that two large bottles and one small bottle contain 
less than 2 litres of water. [5] 


b) Find the probability that a large bottle contains more than 1.5 times as 
much water as a small bottle. [6] 




Tables of the normal distribution 


If Z has a normal distribution with mean 0 and variance 1 then,
for each value of z, the table gives the values of Φ(z), where
Φ(z) = P(Z ≤ z).


For negative values of z use Φ(−z) = 1 − Φ(z).
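
If a computer is available, the tabulated values can be reproduced with the error function. This is a sketch for checking table look-ups, not a replacement for showing working; the function name `phi` is an arbitrary choice.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal distribution function, Phi(z) = P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


print(round(phi(1.23), 4))   # compare with row 1.2, column 3 of the table
print(round(phi(-1.23), 4))  # compare with 1 - Phi(1.23)
print(round(phi(1.234), 4))  # compare with the table value plus the ADD entry
```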


z | 0 1 2 3 4 5 6 7 8 9 | 1 2 3 | 4 5 6 | 7 8 9 (ADD)
0.0 | 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359 | 4 8 12 | 16 20 24 | 28 32 36
0.1 | 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 | 4 8 12 | 16 20 24 | 28 32 36
0.2 | 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 | 4 8 12 | 15 19 23 | 27 31 35
0.3 | 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517 | 4 7 11 | 15 19 22 | 26 30 34
0.4 | 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879 | 4 7 11 | 14 18 22 | 25 29 32
0.5 | 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224 | 3 7 10 | 14 17 20 | 24 27 31
0.6 | 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 | 3 7 10 | 13 16 19 | 23 26 29
0.7 | 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 | 3 6 9 | 12 15 18 | 21 24 27
0.8 | 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133 | 3 5 8 | 11 14 16 | 19 22 25
0.9 | 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389 | 3 5 8 | 10 13 15 | 18 20 23
1.0 | 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 | 2 5 7 | 9 12 14 | 16 19 21
1.1 | 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830 | 2 4 6 | 8 10 12 | 14 16 18
1.2 | 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015 | 2 4 6 | 7 9 11 | 13 15 17
1.3 | 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177 | 2 3 5 | 6 8 10 | 11 13 14
1.4 | 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319 | 1 3 4 | 6 7 8 | 10 11 13
1.5 | 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441 | 1 2 4 | 5 6 7 | 8 10 11
1.6 | 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545 | 1 2 3 | 4 5 6 | 7 8 9
1.7 | 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633 | 1 2 3 | 4 4 5 | 6 7 8
1.8 | 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706 | 1 1 2 | 3 4 4 | 5 6 6
1.9 | 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767 | 1 1 2 | 2 3 4 | 4 5 5
2.0 | 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817 | 0 1 1 | 2 2 3 | 3 4 4
2.1 | 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857 | 0 1 1 | 2 2 2 | 3 3 4
2.2 | 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890 | 0 1 1 | 1 2 2 | 2 3 3
2.3 | 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916 | 0 1 1 | 1 1 2 | 2 2 2
2.4 | 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936 | 0 0 1 | 1 1 1 | 1 2 2
2.5 | 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952 | 0 0 0 | 1 1 1 | 1 1 1
2.6 | 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964 | 0 0 0 | 0 1 1 | 1 1 1
2.7 | 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974 | 0 0 0 | 0 0 1 | 1 1 1
2.8 | 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981 | 0 0 0 | 0 0 0 | 0 1 1
2.9 | 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986 | 0 0 0 | 0 0 0 | 0 0 0


Critical values for the normal distribution 


If Z has a normal distribution with mean 0 and variance 1 then, for each value
of p, the table gives the value of z such that


P(Z ≤ z) = p.
p | 0.75 0.90 0.95 | 0.975 0.99 0.995 | 0.9975 0.999 0.9995
z | 0.674 1.282 1.645 | 1.960 2.326 2.576 | 2.807 3.090 3.291
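
The critical values in this table can be reproduced with the inverse of the standard normal distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are arbitrary.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, variance 1
probabilities = [0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9975, 0.999, 0.9995]

# z such that P(Z <= z) = p, rounded to 3 decimal places as in the table
critical_values = {p: round(std_normal.inv_cdf(p), 3) for p in probabilities}
for p, z in critical_values.items():
    print(p, z)
```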


Answers 


The answers given here are concise. However, when Exercise 1.3 page 9 
answering exam-style questions, you should show as 


many steps in your working as possible. 1. a) P(X = 4) = (2.5/4) × P(X = 3)


ks ee P b) 0.134 
1 The Poisson distribution 
©) P(X=4) =0.134 
Skills check page 1 d) 2is the mode 
1. a) 0.0498 b) 0.1225 
2. a) P(X =5)=2x P(X =4)=P(X =4) 
2. 0.101 2 
b) 4and 5 are equal highest probabilities. 
Exercise 1.1 page 4 3. a) λ = 4.8
1. i) 0.271 ii) 0.271 iii) 0.180 b) 4 is the mode
2. i) 0.165 ii) 0.298 iii) 0.268 : 
¥ 5 pee Exercise 1.4 page 11 
3. i) 0.124 ii) 0.174 iii) 0.116 ; st 
1 i) 3.2 ii) 3.2 
4. i) 0.670 ii) 0.268 iii) 0.054 
2. 49;7 
5. i) 0.269 ii) 0.104 iii) 0.016 
3. a) μ = 3.6; σ = 1.90
6. i) P(X = 2) = 0.209 ii) P(X ≤ 2) = 0.380 b) 0.485
iii) P(X ≥ 2) = 0.829 c) 0.031
7. a) P(X=2)=0.209 ayo 
b) P(X > 2) = 0.620 4. a) μ = 6; σ = 2.45
8. a) P(X = 2) = 0.230 b) 0.394
b) P(X <2) = 0.627 s) 00% 
d) 0.017 
Exercise 1.2 page 6 5. Note that there is no ‘right answer’ to this question. 
1. a) P(X=2) =0.238 In question 3, where A is smaller the distribution is 
- more heavily skewed (positively) and there are no 
By) RUS e0s3) possible outcomes more than 2 standard deviations 
2. a) P(X=2)=0.238 below the mean, while in question 4 it is possible - 
b) P(X < 2) = 0.231 although the likelihood is still small.


3. a) P(Y=2)=0.144 


Exercise 1.5 page 16 
b) P(X>2)=0.430 


- af. 12 
4. a) P(X=1)=0.164 1a) Yes A=a(=2 
b) P(X <2) =0.833 b) No ~ ina single lane the independence / 
c) P(X=6)=0.146 randomness required for the Poisson is lost. 


c) Unlikely because it is so close to Christmas and the 


5. a) P(X=2)=0.224 average rate is not likely to be the same as normal. 


b) P(X>2)=0.577 




d) 


e) 


©) 


a) 
b) 


a) 
b) 
c) 


a) 


Unlikely to be the same average rate over the 
whole 24 hour period. 

A&E is likely to be busier at this time than
usual because of the heavy traffic at the end of the
working week.


a) and b) require the defects to occur independently
of one another and at random. In part a) λ = 1.6
and in b) λ = 4.

The customers need to arrive independently of
one another and at random times; in 20 minutes
use λ = 8. Note that you might get two (or more)
people entering the shop together but only one
would actually be a customer.

2.514; Var(X) = 2.36 

The mean and variance are not all that similar 
so the Poisson is unlikely to be a good model for 
this situation. 

i) 0.223 
0.193 
Probably not as the average rate in the middle of 
the night is unlikely to be the same. 


b) 0.169 ©) 


ii) 0.191 iii) 0.303 


0.335 0.222 


Summary exercise 1 page 17 


1. 


a) 


d) 
a) 
b) 
©) 
d) 


0.247 b) 0.821 c) 0.425 
i) 0.041 ii) 0.130 iii) 0.620 
i) 3.2 ii) 3.2 iii) 1.79


P(X = k + 1) = (3.2/(k + 1)) × P(X = k), which is multiplying
by more than 1 when k is under 3, so the
probabilities are increasing, but by less than 1
when k is 3 or greater, so the probabilities are then
decreasing and 3 has the largest probability.
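
The recurrence argument can be checked numerically. The sketch below assumes the distribution is Po(3.2), as the mean and variance in the previous answer suggest; the function name is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


lam = 3.2
probs = [poisson_pmf(k, lam) for k in range(12)]

# Ratio of successive probabilities, which should equal lam / (k + 1)
ratios = [probs[k + 1] / probs[k] for k in range(11)]

# The mode is the value of k with the largest probability
mode = max(range(12), key=lambda k: probs[k])
print(mode, [round(p, 3) for p in probs[:6]])
```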


i) 0.603 ii) 0.609 

i) 0.119 ii) 0.159

i) 6.4 ii) 6.4 iii) 2.53
6 

i) 0.119 

ii) 0.684 




10. 


11.


12. 


13. 


In an hour on average 1.6 calls are received, so use 
Po(1.6). 


a) 0.258 b) 0.217

a) 0.258 b) 0.525

a) Mean = 4; standard deviation = 2
b) 0.371 c) 0.547

a) x̄ = 2.67; variance = 3.89


b) No - the mean and variance are not close - and 
the pattern of results (where no accidents gives 
a second peak along with 4) is not typical of the 
single modal form of the standard Poisson. 

a) P(one error) = 0.129 


b) For 20 pages the total number of errors, X, has a 
Poisson distribution with mean 3. 


P(X < 2) = 0.423 
c) The most likely (equal likely) are 2 and 3 errors. 


a) P(one error) = 0.329 

b) P(two errors on double page) = 0.217 

¢) P(one error on each page | two errors on double 
page) = 0.5 

a) P(X=2)=0.245 

b) P(X < 4) = 0.863 

c) 5 


a) E(X)=2.5; Var(X)=2.5; E(Y)=5; Var(Y)=10 


b) E(Y) ≠ Var(Y), so Y cannot have a Poisson
distribution.


a) 0.220 b) 0.265 


a) 0.731 b) 2.5 days 


2 Approximations involving the 
Poisson distribution 

Skills check page 20 

1. 0.882 


2. 0.258 


Exercise 2.1 page 22 

1. a) X ~ B(60, 0.05)
i) P(X = 0) = 0.046 ii) P(X = 1) = 0.145
iii) P(X = 2) = 0.226 iv) P(X > 2) = 0.583

b) X ~ Po(3) (approx.)
i) P(X = 0) = 0.050 ii) P(X = 1) = 0.149
iii) P(X = 2) = 0.224 iv) P(X > 2) = 0.577
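
The two sets of probabilities above can be compared directly. The following sketch computes the exact B(60, 0.05) probabilities and the Po(3) approximation; the function names are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


n, p = 60, 0.05
lam = n * p  # the approximating Poisson distribution has mean np = 3
for k in range(3):
    print(k, round(binomial_pmf(k, n, p), 3), round(poisson_pmf(k, lam), 3))

# Upper tails, P(X > 2), for each model
binomial_tail = 1 - sum(binomial_pmf(k, n, p) for k in range(3))
poisson_tail = 1 - sum(poisson_pmf(k, lam) for k in range(3))
print(round(binomial_tail, 3), round(poisson_tail, 3))
```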


2. a) n > 50 and np < 5 (approximately)
b) 0.000125
c) X ~ Po(4.5) (approx.)
P(X < 3) = 0.174
3. X ~ Po(4) (approx.)
P(X < 5) = 0.629
4. a) X ~ B(130, 0.005)
P(X = 1) = 0.340
b) X ~ Po(3) (approx.)
P(X < 5) = 0.815
5. a) X ~ Po(4) (approx.)
i) P(X = 0) = 0.018
ii) P(X ≥ 5) = 0.371


b) Samples vary and you are not told how the 
sample was taken: a sample of 1000 is quite 
large but it would not mean you could be sure 
the level of support was higher than the party 
claimed. (You will consider issues like this 
in more depth in Chapters 8 and 9 when you 
meet hypothesis tests.) 


i. sa) x ~B(180, 1) P(X = 1) = 0.362 


1 \7" (10 ‘a = 
b) x B(500, 2 ~'Po(2) P(X <5) = 0.756 


Exercise 2.2 page 25 


1. 


a) 
b) 
°) 
d) 


a) 


a) 
d) 


a) 


b) 


P(Y < 15.5) 

P(Y> 22.5) 

P(Y< 17.5) 

P(44.5 < Y< 61.5) 

Yes, N(16, 16) b) No c) No
0.142 b) 0.984 c) 0.916
0.163 e) 0.611

i) 0.424 

ii) 0.424 

i) 0.00016 

ii) 0.0377% 


If λ is greater than approximately 15 then the
N(λ, λ) distribution can be used with the continuity correction to
approximate the Po(λ) distribution.
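
A quick numerical comparison shows how close the approximation is. The sketch below uses λ = 30 and the probability P(X ≤ 24) purely as an illustration; the function names are arbitrary.

```python
from math import exp, sqrt
from statistics import NormalDist


def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Po(lam), summing the terms one at a time."""
    term = exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


def normal_approx_cdf(k, lam):
    """Approximate P(X <= k) using N(lam, lam) with a continuity correction."""
    return NormalDist(mu=lam, sigma=sqrt(lam)).cdf(k + 0.5)


exact = poisson_cdf(24, 30)
approx = normal_approx_cdf(24, 30)
print(round(exact, 3), round(approx, 3))
```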


a) 0.185 b) 0.123 
a) 0.251 b) 0.614 
0.182 


Summary exercise 2 page 26 


1. 


a) 


i) 0.219 
ii) 0.337 
iii) 0.255 
iv) 0.190 


Note that the answer to iv) is 0.190 correct to 3 d.p. 
although using answers to parts i) to iii) to 3 d.p. will 
give 0.189. 


b) 


a) 
b) 
a) 


b) 


X ~ B(60, 0.025) ≈ Po(1.5)
i) P(X = 0) = 0.223
ii) P(X = 1) = 0.335
iii) P(X = 2) = 0.251
iv) P(X > 2) = 0.191


X ~ B(7500, 1/2000) ≈ Po(3.75), P(X > 3) = 0.516


Smallest possible n is 9211. 


They need to arrive independently and at random
(either condition will be enough). 


Using Po(5): P(X = 4) =0.175 




c) Using Po(2.5): P(Y < 3) = 0.544
d) Using Po(30) ≈ N(30, 30): P(W < 25) = 0.158
a) i) 0.713 
ii) 0.360 
b) i) approx S| 


ii) 0.212 
P(X>2) =0.121 




0.433 
a) B(5480, 0.001) 
b)  Po(5.48) 

c) 0.639

0.221

a) 0.924


b) 6908 


3 Linear combination of random 
variables 


Skills check page 28 
E(X) = 2.4; Var(X) = 6.64 


1, 


Exercise 3.1 page 31 


1. 


a) 
b) 
°) 
d) 


a) 


b) 


©) 


d) 


a) 
b) 
a) 
b) 
©) 
d) 


a) 
c) 
d) 
e) 
a) 
b) 
°) 
d) 


a) 


E(2X + 7) = 18.4; Var(2X + 7) = 7.6
E(4 − 3X) = −13.1; Var(4 − 3X) = 17.1
E(X + 3) = 8.7; Var(X + 3) = 1.9

E(7X) = 39.9; Var(7X) = 93.1

i) E(X) = 3.1; Var(X) = 1.29

ii) E(5X − 2) = 13.5; Var(5X − 2) = 32.25

i) E(Y) = 0.1; Var(Y) = 1.29

ii) E(5Y − 2) = −1.5; Var(5Y − 2) = 32.25
E(X) − 3 = 0.1 = E(Y); Var(X) = 1.29 = Var(Y)
i) E(V) = 8.25; Var(V) = 4.8125

ii) E(4 + 3V) = 28.75; Var(4 + 3V) = 43.3125
i) E(W) = −0.75; Var(W) = 4.8125

ii) E(4 + 3W) = 1.75; Var(4 + 3W) = 43.3125
iii) E(W) = E(V) − 9; Var(W) = Var(V)

E(X) = −0.2; Var(X) = 1.36

E(7 − 2X) = 7.4; Var(7 − 2X) = 5.44
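
Answers of this kind all follow from E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). The sketch below applies these rules; the starting values E(X) = 5.7 and Var(X) = 1.9 are inferred from the first group of answers above and should be treated as an assumption.

```python
def linear_transform(mean, variance, a, b):
    """Mean and variance of aX + b, given the mean and variance of X."""
    return a * mean + b, a ** 2 * variance


mean_x, var_x = 5.7, 1.9  # assumed values of E(X) and Var(X)
results = {
    "2X + 7": linear_transform(mean_x, var_x, 2, 7),
    "4 - 3X": linear_transform(mean_x, var_x, -3, 4),
    "X + 3": linear_transform(mean_x, var_x, 1, 3),
    "7X": linear_transform(mean_x, var_x, 7, 0),
}
for name, (m, v) in results.items():
    print(name, round(m, 2), round(v, 2))
```

Adding a constant shifts the mean but leaves the variance unchanged, and the sign of a does not affect the variance.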


E(X) = 12.25; 

0.6 

E(0.9X) = 11.025; 

i) 11.25; 28.6875 

ii) Buying 4 x $5 vouchers would get $4 off 
instead of just a $1 reduction. 


b) E(X)=8 


Var(X) = 28.6875 


Var(0.9X) = 23.2 


P(X = 8) = 0.3 
Var(X) =1 

E(5 - 2X) =-11 
Var(5 — 2X) =4 


4; 8a + 11b = 3.95 
a=0.15 

Var(X) = 2.0475 

Var(5 — 3X) = 18.4275 


x BmbnnE 
S| 2) S|] 
P(X=x) | 95 | 35 | 35 | 25 | 25 


10. 


Exel 
1. 


b) P2<xX<5)= ©) E(X)=22 
d) Var(X) = 1.36 e)  Var(2— 3X) = 12.24 


a) E(X)=17.85 b)  Var(X) = 26.7275 
c) Var(20 — 3X) = 240.5475 
| 

a) k=d 
b) E(X)=5; Var(X)=3 
c) Makes.a loss if no more than 2 options are 

successful, so probability = 
d) E(P) = 3500; 

Var(P) = 6750000 
This would be a good investment portfolio, spread 
over a number of individual investments, with a low 
probability of making a loss and with an average 
return of 87.5% of the capital at risk. 


a) E(X)=2.8; Standard deviation = 1.29 
b) E(Y)=33; Standard deviation = 12.9 
rcise 3.2 page 37 

i) E(X+Y)=16; Var(X+Y)=7 

ii) E(X-Y)=-4; Var(X-Y)=7 

i) E(X+Y)=30; Var(X+Y)=6 

ii) E(X-Y)=0; Var(X-Y)=6 

i) E(X+Y)=3; Var(X+Y)=3.5 

ii) E(X-Y)=11; Var(X-Y)=3.5 


E(M) = 70.275 
Standard deviation of M = 11.1 


i) E(C)=105 
Standard deviation of C= 15 


ii) It is likely that many customers will usually 
use only one of the two branches but that 
there will be some customers who use both 
— depending on their movements on the day 
they need to visit the bank so independence 
of the number of customers is likely not to 
be true but assuming it is true could still be a 
useful first approximation. 

i) EC)=4 

ii) E(X)=25 

=42 
Var(x) = 2 


= Var(C) =t 




Exercise 3.3 page 39 
Ls. i) o{x,)=90 


ii) E(X,,)=0; Var(X,,)=1 
3. i) #[ 3x, }=-2s0 
var( $x.) =1230 


ii) E(X.)=-5 


Var(X,) =0.5 
4. i) ox, }=250 
var($x;} = 261 


on 
ii) E(X,,)=35.2 
Var(X,,) =2.61 
5. The sample size needed is 6. 
6. The sample size needed is 40.


Summary exercise 3 page 41 
1. a) E(2X+3)=27.2 
Var(2X + 3) = 9.6 
b) E(5-3X)=-31.3 
Var(5 — 3X) = 21.6 
c) E(X − 3) = 9.1
Var(X − 3) = 2.4
d) E(9X) = 108.9
Var(9X) = 194.4




9 


i) E(X) = 14
Var(X) = 1.2

ii) E(4X − 3) = 53;
Var(4X − 3) = 19.2


a) Σp = 1 ⇒ a + b = 0.4
E(X) = 9.2 ⇒ 1.4 + 8a + 2.7 + 1 + 11b = 9.2
8a + 11b = 4.1

b) b = 0.3; a = 0.1

c) Var(X) = 2.16

d) Var(3 − 2X) = 8.64


x 1 2 3) 4 


yl ||! 255-1), 38 Pek 
PX=a) | G6 | ie | is | 16 


b) Pasxsa=2 


Var(X) = 0.859(375) 
e) Var(2 + 3X) =7.73 


i) E(X+Y)=17 
Var(X + Y) =6 
ii) E(X-Y)=3 
Var(X — Y) =6 
E(M) = 59.2 
Standard deviation of M = 5.60 
Note: it may be a little surprising that this is lower than 
either standard deviation which goes to make it up — 
but it is because you are taking the average of the two 
component marks. 
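
The note above can be illustrated numerically. The sketch below assumes two independent component marks with hypothetical standard deviations of 8 and 7.5; these are not the values used in the question.

```python
from math import sqrt


def sd_of_average(sd_x, sd_y):
    """Standard deviation of (X + Y) / 2 for independent X and Y."""
    # Var((X + Y) / 2) = (Var(X) + Var(Y)) / 4
    return sqrt((sd_x ** 2 + sd_y ** 2) / 4)


sd_x, sd_y = 8.0, 7.5  # hypothetical standard deviations of the two components
sd_m = sd_of_average(sd_x, sd_y)
print(round(sd_m, 2))
```

Dividing the sum by 2 divides the variance by 4, which is why the average can be less variable than either component.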
b) EQ)=8 
c) Var(X) = 0.689 
d) 3.67;11.0 
a) b=04; a=0.1 
b) PU<X<6)=04 
©) EX-4)=9.2 
e) Var(3X - 4) = 83.16 


a=08 b=184 


10. a) 39 b) 185.5 ©) 64x10 


4 Linear combination of Poisson 


and normal variables 


Skills check page 46 


I. 


2. 


0.257 


0.275 


Exercise 4.1 page 49 


1. 


2. 


a 2 RY 


a) 0.040 b) 0.120 c) 0.265


a) i) 0.202 ii) 0.857 
b) i) S~Po(1.6); M~Po(5.4) 
ii) 0.827 


iii) 0.026 
=> T ~ Po(7) 


a) 0.169 b) 0.793 
Only part iii) is a Poisson, λ = 8.
0.132 


0.063 


Exercise 4.2 page 54 


Ll 


All of them have a normal distribution. 
i) 3X+2~N(17,63) 

ii) 2X +3Y~N(19, 118) 

iii) X+ Y~N(8,17) 

iv) X- Y~N(2,17) 

i) 2X+3Y~ N(970, 350) 

ii) 0.0544 

i) 5X+4Y~ N(450, 306.1) 

ii) 0.716 

i) 4A -3B~N(57, 130) 

ii) 0.570 

i) Mean = 3444.8; st. dev. = 105.56 
ii) 0.299 


6. i) 0.0105 
4 
ii) o{ Sw, < 3000) = 0.0787 
7. a) i) X~N(19518,259.24) ii) 0.868 
b) 1.000 
Summary exercise 4 page 55 
1. a) i) 0.0273 ii) 0.983 iii) 0.101 
b) i) Y~Po(3.9) ii) 0.352 
2. i) 0.187 ii) 0.00826 
3. i) E(C)=6676 
st. dev. of C = 532.48 ⇒ C ~ N(6676, 532.48²)
ii) Φ(0.330) = 0.629
4. i) 1 − Φ(0.386) = 0.350
ii) Φ(0.877) = 0.810
5. Φ(0.0694) = 0.528
6. i) 14 ii) 0.0596 iii) 0.25
iv) 0.909
v) P(T > 3) = 0.658
7. 1 − Φ(1.080) = 0.140
8. 1 − Φ(0.338) = 0.368
9. 0.529 
10. 0.027 
11. 0.003 
12. a) 0.9641 b) 0.986 




5 Continuous random variables Exercise 5.2 page 64 


1. a) Yes—the area of the triangle is | and 
all non-negative. 


Skills check page 58 


1, k=2 b) No — part of the pdfis negative. 
2 15 c) No-the area of the triangle is 1.5. 
d) No —each of the grid squares in 0.25 square 
Exercise 5.1 page 60 units and there are 4 squares fully shaded along 
with a lot of large fractions of other squares so 


1. a,cand dare continuous (although they will be 
reported to specified accuracies) while b and e are 
counts ~ discrete variables, although b would be 2 a) i) 1 ii) 0.5 
much harder to count than e would be. 


total is > 1. 


b) i) 0.25 ii) 0.5625 
2. t 
-3 = ail 
ae 3. a) k=3 b) k ) k=4 
06 d) k=3 &) k=3 
Bos 
a 4. a) + b) 2 
$04 Y ) is 
E 2 _u 
5 03 ) k=? P(X >)=3t 
E a2 
& 
; ad) i) BxX<y=t 
0.1 . 
ii) P(X=1.5)=0 
0.0 
o 12 3 4 5 6x ee) kee 
Duration of eruption (minutes) 6 
=2 
P(X <2.5)= 3 
Bsa 
—ae-f SxS. 
5, a) fixjayo%ts OS*S3 
) elsewhere 
3. f ‘ 
0.05 b) feyat"*2 O<x<1 
2 0.04 0 elsewhere 


i=J 
S 
& 


6. a) i) P(X > 2)=2 


Frequency densi 
o 
S 
8 


ii) P(X <05)= ed 


0.00 3 
0 10 20 30 40 50 60 70 80 90 100 ? 
Time between eruptions (minutes) b) i) P(X>0.2)=0.88 

4. f ii) P(X<1)=1 

50 

40 Exercise 5.3 page 69 

30 1. w=452 

ss Var(X) = 0.0830 

10 3 

a 2, a) k=z 
280 330 380 




b) 


Co) 


a) 


b) 


°) 


a) 


b) 


©) 


a) 
b) 


©) 
a) 


b) 


b) 
©) 
d) 
b) 


u=2 
o=05 
i) POX> w)=3 


ii) o=0.707 > P(X> p+ 0) = 0.186 


i) P(X > w) = 282 


ii) = 0.8165 > P(X > ut — 20) = 0.950 


k=5 
25 
anc 
o? = 0.01984... 


i) P(X <p) = 0.402 

ii) o=0.1408... 
=> P(|X - p| <0) = 0.718 

E(X) =6 

k=12 

o? =0.15 

i) If the bus arrives regularly but Gupta
arrives at random times then the uniform
distribution over the 15-minute interval is a
sensible model.


ii) =7.5 
Var(X) = 18.75. 


i) k=2 


ii) p= 


Mean = $2125, standard deviation = $1218 
ar 

P(loss) = 2 

P(better return) = 0.6875 


o=0.6 


Exercise 5.4 page 73 
lo a)soy 
£5 
1 
0.5 
0 05 Z 15 20'% 
b) i) Mode - there are two: 0 and 2 


a) 


b) 


ii) Symmetry of the distribution says that the 
median is 1. 


i) Median = √3
ii) The pdf is strictly increasing so the mode is
at the end of interval (at 5). 


yi 


le) 
1 2 3 4 5 


* 


i) Median = √2
ii) Mode is 2. 




©. i) a@=159 Summary exercise 5 page 73 


ii) Modeis 2. it 2 


b) P(0.5 < X < 1) = 0.53 (2 d.p.)
c) E(X) = 0.519


Var(X) = 0.0842 


itt 
2 a) C=4 


b) P(1 < X < 2) = 0.33 (2 d.p.)
2) BXY)=8 
Var(X) = 0.609 


3. a) PSx<2)=4 
b) E(X) =2 


Var(X) = 0.222 
b) The mode is at the maximum of the pdf - from 


the sketching process you can deduce it is at <) WXSw=h 
5, or you could use differentiation to identify 9 
where the gradient is zero. 4. i) P(2.7<x<2.9)=05 
YA ii) P(2.9<x<3) =0.34375 
0.24 5 a) P(x>0.5) =0.0625 
=i 
b) EX)=; 


Var(X) = 0.0267 


xv 


8. b) E(X)= = and Var(X) = 2.83 (3 s.f) 


9. m,m,m,m 
2 My» MyM, 




6 Sampling

Skills check page 76

1. 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, 41, 42, 43, 44

Exercise 6.1 page 78

1. a) i) A population is the complete group of items
(or observations) of interest.
ii) A census is when information is gathered
about all members of the population.
b) No single answer here - your example should
show clearly why it is biased - for example, asking
about health issues outside a fitness centre.

2. a) i) A list in which all members of the
population appear.
ii) A sampling unit can be an individual, or a
household, or the result of an experiment
(e.g. how many heads occur in 3 tosses of a
coin).
b) i) All adults in the UK; could use the latest
electoral register, or census files if it is close
to the time at which the census was last
done (only done every 10 years).
ii) The adults in a country change every day -
people move in and out; some die; others
become adults (on their 18th birthday in
the UK); and any such register will never
have everyone on it that should be there,
so even on the day the information was
collected it would not be 100% accurate.

3. Each member of the population has an equal chance
of being selected, and all possible combinations are
equally likely.

4. a) All the customers of that bank.
b) Use the account records held by the bank.
c) An individual customer may hold more than
one account, possibly even in more than one
name (a woman might have accounts in her
maiden name and married name), but more
than one person can have the same name so
just removing duplicate names may actually
remove customers.

Exercise 6.2 page 81

1. a) Unlikely to get a representative sample with
this method - any reason which identifies a
group likely to be over- or underrepresented in
the sample is a good answer, e.g. people who
have more time are more likely to fill out the
questionnaires.
b) Unless the route is fully booked the offer costs
the airline almost nothing, and getting people to
fill out the questionnaire engages their attention
to consider the possibility of flying on that route
at some point (and paying for a ticket!).

2. Anything where the sample is just done by who
happens to be in a particular place at a particular time
is the easiest way to identify a convenience sample,
e.g. outside a supermarket / train station / school ...

3. a) If they were disposable batteries, then a census
would not be possible as there would be no
batteries to sell. For rechargeable batteries it
would be possible, but very expensive to test
them all.
b) Testing all 50 in one box means that the box can
be removed where taking only one or two from a
box until you have 50 batteries means a number
of boxes are affected and have to be removed
until they have 50 batteries again. That problem
could be avoided by taking a random sample
of 50 at a point in the process before they get
packed into the boxes.

4. a) A census would be very time consuming and
expensive.
b) They would be more concerned with keeping
their important customers happy so they may
wish to deliberately set up a sampling method
designed to focus on particular types of account.

Exercise 6.3 page 85

1. Methods a) and e) will give a random sample - in e)
it does not matter that the sampling frame has been
constructed in a non-random manner because the
method of choosing the members of the sample is
random.




2. The following samples for a) and b) are based on 
using the two-digit numbers in the table, ignoring 00 
and 91-99 and any repeats until 10 different numbers 
are obtained: 


a) 52,44, 17,71, 20, 63, 47, 88, 22, 02 
b) 52, 54,07, 08, 43, 49, 24, 73, 67, 64 


c) For a small sample you could just ignore 00 and
43-99 but it is very wasteful of the random 
numbers - so you could take 01-42 and also 
51-92 (after subtracting 50 from anything in this 
range). 


Just using 01-42 gives 17, 20, 22, 02, 18. 
Using the subtraction method gives 02, 17, 21, 20, 
13 - note that rather than taking the next block 
of 42 numbers immediately after the first, the 
arithmetic is much simpler if you subtract 50 - it 
means that 00, 43-50 and 93-99 are not used, 
which looks a little strange, but is much easier.

d) As long as you specify what you intend to do
before looking at the table of numbers there is 
no reason why it has to start at the top left. 


Just using 01-38 gives 07, 20, 22, 16, 11, 08. 


Using 01-38 and 41-78 (after subtracting 40 in 
this range) gives 07, 27, 36, 20, 25, 22. 


3. The actual samples depend on the numbers
generated by your calculator. But it should give 
you an insight into the amount of variability in the 
answers taking small samples - everyone owned a 
mobile phone so there is no variability there; the 
favourite type of film does have variability, but it is 
not numerical so there is no easy way to summarise 
the variability. The times are all relatively small 
values with a number of repeats so the averages of 
your sample times are likely to be closer together 
than the average number of DVDs owned. 

4 and 5. Again, the actual samples depend on the
numbers generated by your calculator. These 
practical activities are designed to give you both a 
little bit of experience in the process of generating 
samples (pretty tedious to do lots of it ...) and the 
opportunity to get some understanding of just how 
much variability results from taking random samples. 


Exercise 6.4 page 90 


1. A statistic is just a function of (some or all of) the set 
of observations collected, so a to d are all statistics 
but e is not as it involves parameters. 




. > 
S 1 = = 
2. a) Dx @ B(20, 3) b) Ex. = )- 0.0143 
=o0,pet =20, = 40 
©) n=20, p=} BX) =%; vari) = 4 
- P 10 7 
3. a) dx, re B(10, 3) b) pas <7) =0.474 
©) n= 10, p=3 = E(X)=7.5 Var(X) = 1.875 
4, a) p=26 
b) (1,1,1) (1,1,3) (1,3,1) (1,3,3) (3,1,1)
(3,1,3) (3,3,1) (3,3,3)
These 8 possible samples have means:
1, 5/3, 5/3, 7/3, 5/3, 7/3, 7/3 and 3 and
the modes are 1, 1, 1, 3, 1, 3, 3, 3 and
the medians are 1, 1, 1, 3, 1, 3, 3, 3
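
The list of samples and their statistics can be reproduced by brute-force enumeration. This is a minimal sketch that assumes the population values are 1 and 3 and that ordered samples of size 3 are drawn (both read off the answer above rather than from the question itself).

```python
from fractions import Fraction
from itertools import product
from statistics import median, mode

# Assumed set-up: ordered samples of size 3 from the values 1 and 3.
samples = list(product([1, 3], repeat=3))

means = [Fraction(sum(s), 3) for s in samples]    # exact sample means
modes = [mode(s) for s in samples]                # most frequent value
medians = [median(s) for s in samples]            # middle value of three

for s, m, mo, md in zip(samples, means, modes, medians):
    print(s, m, mo, md)
```

With three observations taking only two possible values, the mode and the median always coincide, which is why the two lists in the answer are identical.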
Exercise 6.5 page 93 
1. a) E(X̄) = 5; Var(X̄) = 0.4
b) E(X̄) = 26.3; Var(X̄) = 0.78
c) E(Z̄) = 2; Var(Z̄) = 0.1
d) E(X̄) = 0; Var(X̄) = 0.18
2. a) Var(X̄) = 1.54
b) The minimum sample size is 24.
3. a) E(X) = 5; E(X²) = 33.5 ⇒ Var(X) = 8.5
b) E(X̄) = 5; Var(X̄) = 0.708
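
The two steps in question 3 can be sketched as follows. The sample size n = 12 is an assumption: it is not legible in this answer and is inferred only because 8.5 / 12 matches the quoted 0.708.

```python
# Values quoted in the answer to question 3 a).
e_x = 5
e_x2 = 33.5

# Var(X) = E(X^2) - [E(X)]^2
var_x = e_x2 - e_x ** 2

# Variance of the sample mean is Var(X) / n.
n = 12  # assumed sample size, inferred from the quoted answer
var_mean = var_x / n

print(var_x, round(var_mean, 3))
```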
Exercise 6.6 page 95 
Loa) Kw N{ 600, 22°) b) P(X<597)=0.154 
2 a) Xy~ n(352, 7a} P(X,.2 350) = 0.920 
b) The smallest sample size is 28. 
3. a) The smallest sample size is44. b) 0.493 
4, a) 0.139 b) 0.166 
5. 0.140 


Exercise 6.7 page 99 


I. 
3. 


0.0942 2. 0.972 
a) 0.931 


b) The sampling distribution for the mean of a large 
sample uses the Central Limit Theorem to allow 
calculation of probabilities using the normal 
distribution as an approximation, but for small 
samples the distribution is not known and 
probabilities cannot be calculated. 


X ~ Po(7) => E(X) = Var(X) =7 

: 1 epi - 

) ars x, "~'n(7,2) 

ii) Φ(1.5813) = 0.943

X ~ B(10, 0.3) = E(X) = 3, Var(X) = 2.1 
: = me 94 

i) cT>X, ~N 3 

ii) 1 − Φ(1.5816) = 0.0569
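
The normal probabilities in these answers come from tables, but they can also be checked numerically. The sketch below builds the standard normal cdf from the error function in Python's standard library; the function name `phi` is illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


lower = phi(1.5813)            # area to the left of z = 1.5813
upper_tail = 1 - phi(1.5816)   # area to the right of z = 1.5816

print(round(lower, 3), round(upper_tail, 4))
```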


Summary exercise 6 page 102 


L 


While there are three outcomes (home win, draw and 
away win) these are not equi-probable outcomes (there 
tend to be more home wins in most league formats). 


a) Arandom sample is one in which each member 
of the population has an equal chance of being 
selected, and all possible combinations are 
equally likely. 

b) Any reason which identifies a group likely to 
be over- or underrepresented in the sample is a 
good answer. 


It would be very expensive to test every battery for its 
usable life before it was sold - and they would need 
to be charged again. 


Your answer will depend on the particular random 
sample you have taken, but many sample means will 
lie in, or close to, the interval 55-60, 


a) AB AC AD BA BC BD CA 
CB CD DA DB DC 


b) If A, B are 50 cent coins, C is the 20 cent 
coin and D is the 10 cent coin then the mean 
values of these samples (in the same order) are 
50 35 30 50 35 30 35 35 15 30
30 15


Sample mean   15    30    35    50
Probability   1/6   1/3   1/3   1/6
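
The sampling distribution of the mean can be built directly from the twelve ordered samples. A minimal sketch, using the coin labels given in part b) and treating each ordered pair as equally likely:

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Coin values in cents, as labelled in part b).
coins = {"A": 50, "B": 50, "C": 20, "D": 10}

# Ordered samples of two different coins: AB, AC, AD, BA, ...
pairs = list(permutations(coins, 2))
means = [Fraction(coins[a] + coins[b], 2) for a, b in pairs]

# Each ordered pair is equally likely, so probabilities are relative counts.
counts = Counter(means)
dist = {m: Fraction(c, len(pairs)) for m, c in sorted(counts.items())}

for m, p in dist.items():
    print(m, p)
```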


10. 


11. 


E(X) =22.4 Var(X)=7.9 

a) Var(X,,) = 0.395 

b) The smallest sample size is 8. 
a) 


a) F-N( 305, 
b) 1 − Φ(2.008) = 0.0223
a) Mo ~N(252,2=) 


10 
Φ(1.807) = 0.965


b) The smallest sample size is 21. 
a) iL 

n 
b) Normal population, or large sample. 


Note that these answers are illustrative only - other 
reasons are possible. 


a) It is not a random sample - if a train was delayed
all these people could be affected, and if the train 
has not yet arrived, the response is not the time 
they will wait for the train. 


b) Again, it is not a random sample, and attendance 
on a Monday morning may not be typical of the 
full week. 

c) This is systematic sampling - and on
production lines often one or more processes 
use a rotation of instruments; if one of several 
instruments is faulty then defective batteries 
will occur at regular intervals in addition to any 
random faults. 


a) It would be far too time-consuming to take 
a census of this information (even if it were
possible to be sure of measuring everyone). 


b) Any batteries that are tested are no longer able to 
be used (testing to destruction). 




7 Estimation 4. i) 95% CI = (204.5, 230.5) 

; ii) 98% CI = (202.1, 232.9) 
Skills check page 106 5.) eens i essa tap) 
1. Mean = 11.17; variance = 31.81 


Exercise 7.5 page 121 
1. a) i) 95% CI is (3.15, 11.1)
ii) 90% CI is (3.79, 10.5)


There is no exercise for Section 7.1. 


Exercise 7.2 page 110 
1. a) 63.00 b) i) 95% CI is (41.2, 47.2)
b) Reading across: 66.32, 63.24, 60.74, 67.62, ii) 90% CI is (41.7, 46.7)
60.14, 59.96 


c) i) 95% CI is (8.06, 12.9)


c) 64.78, 64.18, 60.05
ii) 90% CI is (8.45, 12.6)


2. a) 61.12 
b) 64.5, 63.2, 59.26, 59.7, 59.6, 60.44 
c) 63.85, 59.48, 60.02


d) i) 95% CI is (4.54, 12.84)
ii) 90% CI is (5.20, 12.18)


The aim of question 3 is to explore the behaviour of repeated e) i) 95% CI is (-0.593, -0.075)
samples of different sizes, but there are no solutions here. ii) 90% CI is (-0.552, -0.116)
Exercise 7.3 page 114 2. Not really - a little bit of uncertainty is removed because


you do not need to rely on the Central Limit Theorem 


1. a) as an approximating distribution for the sample mean, 
b) but you produce the same CI in both cases.
°) 3. (323.9, 325.3) 4, (27.1, 29.5) 
4) 5. (502.34, 504.06) 6. (101.1, 103.1) 
e) 
zu @) Exercise 7.6 page 123 
b) 1. a) i) (0.226, 0.391) ii) (0.239, 0.378) 
4 ) b) i) (0.067, 0.215) ii) (0.079, 0.203) 
a 
b) c) i) (0.281, 0.525) ii) (0.301, 0.506) 
d) i) (0.202, 0.417) ii) (0.220, 0.400) 
4. a) e) i) (0.074, 0.214) ii) (0.086, 0.203) 
in (0.056, 0.184) 3. (0.188, 0.387) 
c 
4) er 4. i) (0.429, 0.524) 
) 2 1/54 ii) That the cars observed form a random sample 
$ es of all cars on the road (and that the data are 
5. a) Var(X) = 215.76 b) 8=231.2 accurate — not necessarily easy data to collect) 


iii) @=99 


6. y=240; sf=54 
4 2 5. i) (0.386, 0.510) 


ii) Need a sample of at least 1521 to have the CI this 


Exercise 7.4 page 118 small. 

1. (723.4, 760.6) 6 i) n=242 ii) @=95% 
2. i) (601.4, 611.9) ii) Narrower 

3. (33.8, 42.2) 




Summary exercise 7 page 124 
1 ¥=408; s= 1483 


2 24.8; 8 =2.61 
3. 41; $=39.6 
4. a) 3.92; s*=313 
b) ¥=36.3; s?= 1632 
5. 
6. ii) o=124 
7. a) i) 95% CI is (-2.19, 1.15)
ii) 90% CI is (-1.92, 0.88)
b) i) 95% CI is (10.2, 13.0)
ii) 90% CI is (10.4, 12.8)
8. (11.7, 12.9) 9. (0.037, 0.177)


10. (0.158, 0.309) 


11. i) = m=123 ii) @=98.4 


12. i 


13. i 


14. 


15. 
16. 


17. 


x̄ = 3.9856; s² = 0.00124
96% CI is (3.979, 3.992)
By definition of CIs you expect on average that
96% of the 96% CIs will contain μ, so on average
there will be 8 out of 200 which do not.
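
The count of 8 is simply the expected number of "misses" among 200 independent intervals. A small sketch of that arithmetic (variable names are illustrative):

```python
n_intervals = 200
confidence = 0.96

# Each interval misses the true mean with probability 1 - confidence,
# so the expected number of misses is n * (1 - confidence).
expected_misses = n_intervals * (1 - confidence)

print(round(expected_misses, 2))
```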


(0.384, 0.553) 

The people able to be in a shopping centre 
ona Wednesday afternoon do not mirror the 
population you are trying to make inferences 
about. 


(744.4, 751.6) 
(0.19, 0.33) 


39 
a) 


b) 


Mean = 3.28, variance = 0.0447 using exact x̄
in estimating variance, or 0.0697 using rounded
value for mean.


(3.22, 3.35) using exact x̄ in estimating variance
or (3.20, 3.36) using rounded value for mean in 
estimating variance. 




8 Hypothesis testing for discrete 


distributions 


Skills check page 127 


1. 
2. 


x=3 


xa4 


Exercise 8.1 page 132 


1. 
2. 


E 


H₀: p = 0.23 vs H₁: p > 0.23
Do not reject H₀ and conclude that there is not sufficient
evidence to suggest that the proportion has increased.
H₀: λ = 0.45 vs H₁: λ ≠ 0.45

Reject H₀ and conclude that there is evidence at the
5% level of significance to suggest that the average rate
of accidents at the weekend is different to weekdays.


Exercise 8.2 page 137


The first few solutions here have extra details included about 
why the critical region stops where it does at either end of the 
distribution. 


1. 




Two-tail test, critical region includes 0, 1, 9 and 10 
and the exact size of the test is 2.1% (including 2 and 
8 would increase it to 11%). 


One-tail test, critical region includes 0 and 1 and 
the exact size of the test is 1.1% (including 2 would 
increase it to 5.5%). 


Two-tail test, critical region includes 0 and 10 and 
the exact size of the test is 0.2% (including 1 and 9 
would increase it to 2.1%). 


Two-tail test, critical region includes 0, 8, 9, 10, 

11 and 12, and the exact size of the test is 2.3% 
(including 1 would increase the bottom tail to 8.5%
and including 7 would increase the top tail to 3.9%). 


One-tail test, critical region includes just 0 and the 
exact size of the test is 1.4% (including 1 would 
increase the size to 8.5%).


This example highlights one of the potential problems
in testing with small samples - this should be a two- 
tail test, but to reject at the top end (including only 

15 in the critical region) gives a minimum of 3.5% 

in the top tail. At the bottom end the critical region 
includes 0-8, with a total probability of 1.8%. 


One-tail test, critical region includes just 15 and the 
exact size of the test is 3.5%. 


Would want a one-tail test, but smallest critical 
region, containing just 0, would have probability of 
13.5% so no test is possible. 




10. 


11. 


12. 


Using the two-tail test it is impossible to reject at the 
bottom end, but at the top having the critical region 
of 6 or more has probability 1.7% (including 5 makes 
it 5.3%). 


One-tail test, critical region includes 0 to 3 and the 
exact size of the test is 3.0% (including 4 would 
increase the size to 7.4%). 


One-tail test, critical region includes 15 and higher, 
and the exact size of the test is 2.7%. 


The 10% two-tail test will simply include the two 5% 
one-tail critical regions and the exact size will be 
the sum of the two sizes there: critical region is 0-3 
or 15 and higher, and the exact size is 5.8% (or 5.7% 
using rounded figures). 


Exercise 8.3 page 141 


1. 


i) 16.7% 

ii) 16.7% (the size of the test is the probability of a 
Type I error) 

Type II error is when null is incorrectly accepted
- probability of 2 or more is 45.1%.

i) 0, 1 ii) 1.7% iii) 1.7%

Type II error is when null is incorrectly accepted
- probability of 2 or more is 80.1%.

i) 3.5% ii) 3.5%

Type II error is when null is incorrectly accepted
- probability of 3 or more is 32.3%.

a) i) ii) 4.7% 

b) i) Reject H₀ and conclude there is sufficient


evidence at the 5% level to suggest that the 
mean rate is greater than 0.8. 


ii) 


3 or more 


ii) Type I - since the test rejects H₀ the only
error possible is if H₀ were true.

i) P(Po(3) = 0) = 0.0498; P(Po(3) ≤ 1) = 0.199,
so the critical region for a 5% test only contains 0.
P(Po(0.3) = 0) = 0.741; P(Po(0.3) > 0) = 0.259,
so if λ = 0.3 the probability of a Type II error
(accepting the null hypothesis incorrectly) is
25.9% (>25%).
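
The Poisson figures quoted in this answer can be reproduced with a short sketch. The helper `poisson_cdf` is an illustrative name; it simply sums the Poisson probabilities from 0 up to k.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))


# Under the null hypothesis (mean 3): candidate critical regions {0} and {0, 1}.
p_zero = poisson_cdf(0, 3)
p_zero_or_one = poisson_cdf(1, 3)

# If the mean is really 0.3, H0 is wrongly accepted whenever X > 0.
type_ii = 1 - poisson_cdf(0, 0.3)

print(round(p_zero, 4), round(p_zero_or_one, 3), round(type_ii, 3))
```

Only {0} keeps the test size below 5%, which is why the critical region cannot be extended to include 1.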


a) i) 


it) 


P(X ≥ 16) = 5.1%; P(X ≥ 17) = 1.6% so the
critical region for a 5% test is X ≥ 17.

ii) 1.6%

75% of 20 is 15, so the test would not reject H₀,
meaning that a Type II error is the only type that
could have occurred.


b) 


Exercise 8.4 page 144 6. 
I. 


If she was guessing then the number of correct i) 
answers, X, she gets on the test can be modelled by 
the B(15,0.2) distribution. 
The test is H₀: p = 0.2 vs H₁: p > 0.2, and the critical
region is 7 or more. P{B(15, 0.2) ≥ 6} = 6.1%; P{B(15,
0.2) ≥ 7} = 1.8% so there is not sufficient evidence to
suggest that she has done better than by guessing.
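
One way to locate an upper-tail critical region like the one above is to search for the smallest k whose tail probability does not exceed the significance level. This is a sketch with illustrative names, using the B(15, 0.2) model and a 5% level:

```python
from math import comb


def binom_upper_tail(n, p, k):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n, p, alpha = 15, 0.2, 0.05

# Smallest k with P(X >= k) <= alpha is the start of the critical region.
critical = next(k for k in range(n + 1) if binom_upper_tail(n, p, k) <= alpha)

print(critical)
print(round(binom_upper_tail(n, p, 6), 3), round(binom_upper_tail(n, p, 7), 3))
```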
There is nothing to suggest that more heads were 
expected before knowing that 10 out of 12 tosses were 1, 
heads, so you need to look at the two-tail test - ie. i) 
reject null hypothesis if 0, 1, 2, 10, 11 or 12 heads are 
seen since P{B(12, 0.5) = 0, 1, 2, 10, 11, 12} = 3.9%. 
i) Since 10 heads were observed, reject H₀ and
conclude that there is evidence to suggest the 
coin is biased. 


ii) 


ii) Type I is the only one possible since H₀ was
rejected by the test. 


If X is the number of flights arriving more than 5
minutes after the published arrival time, then

X ~ B(30, 0.04). P(X > 2) = 11.7%; P(X > 3) = 3.1%
so the (one-tail) test, at the 10% level of significance,
is 'Accept H₀ if no more than 3 flights are more than
5 minutes late'. ii)


Since 3 flights in the sample arrived more than 5 minutes 


after the scheduled arrival time, H₀ is accepted and you iii)


can conclude there is insufficient evidence to suggest 

that the airline has overstated the proportion of flights 

arriving within 5 minutes of the published arrival time. 

Since the proportion arriving on time in the sample 

is 91.4%, which is greater than the 90% level that 

the company claims, it cannot be evidence that they 

overstate their performance. 

i) The critical region for a 10% test will always 
include all the critical region for the 5% test 
(and except for very small sample tests will be 
larger) — so the 10% test will be more likely to 3+) 
conclude the financial advisor has overstated 
their performance. 


ii) The test is of H₀: p = 0.1 vs H₁: p > 0.1; b)


P{B(20, 0.1) ≥ 4} = 13.3%; P{B(20, 0.1) ≥ 5} =
4.3% so the critical region is 5 or more. Since 4
in the observed sample were not satisfied you
accept H₀ and conclude there is not sufficient
evidence to suggest that the financial advisor has
overstated their performance.

iii) Since H₀ was not rejected, a Type II error could
have been made. 


iii) 


iv) 


d) 


H₀: p = 0.25 vs H₁: p < 0.25


P{B(24, 0.25) ≤ 2} = 4.0%; so the significance
level of the test is 4.0%.

4.0%

P{B(24, 0.1) ≤ 2} = 56.4%; so the probability
of accepting H₀ incorrectly when p = 0.1 is
(100 - 56.4) = 43.6%.
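
The size of this lower-tail test and its Type II error probability can be checked numerically. A minimal sketch, assuming the critical region is X ≤ 2 as in the answer (function and variable names are illustrative):

```python
from math import comb


def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Size of the test: probability of landing in the critical region under H0.
size = binom_cdf(24, 0.25, 2)

# If p is really 0.1, H0 is rejected with this probability ...
reject_prob = binom_cdf(24, 0.1, 2)
# ... so it is wrongly accepted (Type II error) with the complement.
type_ii = 1 - reject_prob

print(round(size, 3), round(reject_prob, 3), round(type_ii, 3))
```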


Exercise 8.5 page 147 
H₀: λ = 9.5 vs H₁: λ < 9.5


X ~ Po(9.5); P(X ≤ 4) = 4.0%; P(X ≤ 5) = 8.9%,
so the critical region is 0-4.

4.0% 

This value lies in the critical region, so the null 
hypothesis is rejected and you conclude at the 
4% (or 5%) level that there is evidence to suggest 
that the parameter is lower than 9.5. 


On average in a half-hour period there would 

be 10 hits, and the conditions for a Poisson 
(independence and randomness) are likely to 

be satisfied, so X ~ Po(10) is the appropriate 
distribution. 

P(X ≤ 4) = e^(−10)(1 + 10 + 10²/2! + 10³/3! + 10⁴/4!) = 0.029
If it is a site that is likely to be of more interest 

to people in an area of the world with similar 

time zones (e.g. of interest mainly in Asia, or in 
Europe, or in the Americas), then there might be 
more activity on the site during the day in that 
region than at nighttime. 

P(X ≤ 4) = 0.029, so 4 lies in a one-tail test critical
region but not in the two-tail 5% test. The reasoning 
in part iii) suggests that a one-tail test for a decrease 
in the rate during the night is appropriate, but 
without it, the two-tail test would be the one to use. 


The conditions for a Poisson (independence 
and randomness) are likely to be satisfied, so 
X ~ Po(3) is the appropriate distribution. 


P(X ≥ 5) = 1 − e^(−3)(1 + 3 + 3²/2! + 3³/3! + 3⁴/4!) = 0.185


If Y is the number of months in a year in which
5 or more accidents occur then Y ~ B(12, 0.185),
P(Y ≥ 1) = 1 − 0.815^12 = 0.914. So the probability
the crossing will be installed is 0.914.
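
The two-stage calculation above (a Poisson tail for one month, then "at least one such month in twelve") can be sketched as follows. It assumes, as the answer does, that the twelve months are independent; the names are illustrative.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))


# Probability that a single month has 5 or more accidents, X ~ Po(3).
p_month = 1 - poisson_cdf(4, 3)

# Probability that at least one of 12 independent months does so.
p_install = 1 - (1 - p_month) ** 12

print(round(p_month, 3), round(p_install, 3))
```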

Using a 5% significance level (one-tail test for 

a decrease) gives critical regions i) for 1 month 




e) 


it) 


iii) 


iii) 


iv) 


a) 


b) 


c) 


d) 


0 accidents (5.0% level) and ii) for 3 months up
to 3 accidents (2.1% test). 

i) The 1-month test period is quicker, but ii) 
the 3-month test period is likely to be better in 
detecting a shift in the average rate (a lower 
Type II error for the same significance level is 
the reward for using a larger sample in a test). 


X ~ Po(2.5): P(X ≥ 5) = 10.9%; P(X ≥ 6) = 4.2%,
so the critical region for a 5% test is 6 or more. 
4.2% 


This does not lie in the critical region, so accept
H₀ and conclude that there is not sufficient
evidence at the 5% level to suggest that the 
parameter is greater than 2.5. 


The conditions for a Poisson (independence and 
randomness) are likely to be satisfied, so X ~
Po(2.5) is the appropriate distribution.


P(X < 4)= ~( 


The behaviour just before it closes might be 
affected by people deciding not to come in 
because they would not have time to do all they 
wanted - or making a special effort to get there 
because they have been working all day and it is 
the only chance they have - not possible to say if 
it will be but enough differences in conditions to 
suggest that the average rate might be different. 
The reasoning in iii) says that a two-tail test is
appropriate: P(X > 5) = 4.2% (>2.5%) so do not
reject H₀ and conclude that there is not sufficient
evidence at the 5% level to suggest that the 
average rate is different in the last half hour. 


The conditions for a Poisson (independence 
and randomness) are likely to be satisfied, so 
X ~ Po(2.2) is the appropriate distribution. 


P(X ≥ 5) = 1 − e^(−2.2)(1 + 2.2 + 2.2²/2! + 2.2³/3! + 2.2⁴/4!)
= 0.0725
If Y is the number of months in a year in which
5 or more accidents occur then Y ~ B(12,
0.0725), P(Y ≥ 1) = 1 − 0.9275^12 = 0.595. So the
probability that a speed limit is imposed is 0.595. 
i) Using a 5% significance level (one-tail test 
for a decrease) it is not possible to construct a 
test using only one month since the probability 
of no accidents in a month is 11.1%. 
ii) For 3 months up to 2 accidents (4.0% test) 




e) A longer test period is always better in detecting 


a shift in the average rate and in this instance 
there is no choice to be made because a 1-month 
period is not long enough even to get a 5% test. 


Summary exercise 8 page 149 


L i) 


ii) 


ii) 


it) 


ii) 


One-tail 5% test has critical region of just 0 - 
exact significance level is 4.3%. 

Two-tail 10% test has critical region of 0, 1 or 2
(4.3% at the bottom tail) and 12 or higher (3.4% at 
the top tail) — so the exact significance level of the 
test is 7.7%. 

One-tail 2% test has critical region of just 0. 
Exact significance level is 1.1%. 

1.1% 


Type II error is when the null hypothesis is 
incorrectly accepted — here the probability of 
getting 0 (the only value in the critical region) is 
13.5% so P(Type II error) is 86.5%. 


The test is of H₀: p = 0.95 vs H₁: p < 0.95;
P{B(20, 0.95) ≤ 16} = 0.016 < 5% so you reject H₀
and conclude there is evidence at the 5% level to 
suggest that the council has overstated the level of 
satisfaction. 

Since H₀ was rejected, a Type I error could have
been made. 

The test is of H₀: p = 0.1 vs H₁: p > 0.1, and
P{B(20, 0.1) ≤ 4} = 0.957 so the critical region
for the test is 'at least 5 out of the 20 approve',

so the test will accept that the proportion has 
increased. 


This still only represents + of the residents asked 
about the revised plans so it is hardly a resounding 
endorsement — just because it has increased the 
approval rate does not mean that enough people 
approve it for it to be a good idea to proceed. 


P{B(20, 1/6) ≤ 1} = 0.130 > 5% so do not reject H₀
and conclude that there is not sufficient evidence
to suggest that ones appear less often than on a
fair die.

The critical region for the test is just 0, and the
probability that this is seen when p does equal 1/6 is
0.026, so the probability of a Type I error is 2.6%.


A Type II error is to accept that ones do not
appear less often than on a fair die when they do 
appear less often. 


i) 


it) 


a) 


b) 


°) 


d) 


a) 


P{Po(1.2) ≥ 4} = 0.034; P{Po(1.2) ≥ 3} = 0.121 so
the rejection region (= critical region) is 4 or higher.
Type II error is to accept the null hypothesis
incorrectly - accepting H₀ here is if no more than
3 hurricanes are observed: P{Po(2.5) ≤ 3} = 0.758,
so there is a 75.8% chance of making a Type II 
error if the mean number of hurricanes has 
increased to 2.5 per year. 


25 25 
H,:A = Ns H,:A <8 3 or fewer accidents. 


A Type I error in this context would be to
conclude that the mean rate was less than 
one every 3 days when it had not reduced. 
Probability of a Type I error is 0.0338. 
Accept H₀ and conclude that there is not
sufficient evidence to suggest that the rate of 
accidents has reduced. 


0.735 
hen 

(22000, ™Po(4.4); 

H₀: λ = 4.4 vs H₁: λ < 4.4.


P(X =0)=0.0122; P(X $1)=0.054; critical region 
is 0 people with the disorder. 


b) 


°) 


d) 


a) 


b) 


©) 


Accept H₀ and conclude that there is not
sufficient evidence to suggest that the rate of the 
blood disorder is lower in his home country. 
0.89 


With this sample size, there is an 89% chance 
that it would not detect a change in the rate of 
occurrence by a factor of 2. He would need a 
much larger sample size to be able to pick up 
important changes in the rate of occurrence. 


H₀: p = 0.15 vs H₁: p < 0.15. Critical region is
0 or 1 violent outbursts. 

Reject H₀ and conclude that there is evidence at
the 5% level of significance to suggest that the 
rate of violent outbursts has reduced. 

If the disorder is rare, locating a larger random 
sample of patients may be difficult, and 
conducting the trial may be expensive, so she 
may not be able to conduct a larger-scale trial, at 
least in the first instance. 




9 Hypothesis testing using the 


normal distribution 


Skills check page 152 


1. 


x= 63.8 


Exercise 9.1 page 156 


1. i) Reject H₀. ii) Accept H₀. iii) Reject H₀.

2. i) Reject H₀. ii) Accept H₀. iii) Reject H₀.

3. i) Comparing the first and second tests in each 
question: the population has a larger variance 
in the second one and the acceptance region 
includes more values - it is harder to be sure 
that an observed difference would not occur 
because of the randomness in the sampling 
process - and the same observed mean results
in ii) being accepted where i) was rejected.

ii) Comparing the second and third tests in each 
question: the sample size in the third one is 
larger and the acceptance region shrinks - with 
more data it is easier to be sure that the observed 
difference would not occur because of the 
randomness in the sampling process - and the
test goes back to rejecting the null hypothesis.

4. H₀: μ = 351.2 vs H₁: μ ≠ 351.2. Accept H₀.
5. Reject H₀.
6. Plan A: 

i) (82.5, 88.1) 

ii) 0.713 

Plan B: 

i) (84.2, 86.4) 

ii) 0.072 

iii) 49 

Exercise 9.2 page 159 
1. i) Accept H₀ and conclude there is not sufficient


evidence at the 5% level to suggest that the mean 
is not 45. 


ii) Reject H₀ and conclude there is evidence at the 5%
level to suggest that the mean is less than 57.5. 


iii) Accept H₀ and conclude there is not sufficient
evidence at the 5% level to suggest that the mean 


has increased from —0.5. 




2. i) Accept H₀ and conclude there is not sufficient
evidence at the 5% level to suggest that the 
proportion is not 0.4. 

ii) Accept H₀ and conclude there is not sufficient
evidence at the 1% level to suggest that the 
parameter is less than 27.1. 

iii) There is not sufficient evidence at the 5% level to 
suggest that the proportion has increased from 0.8. 

Note: if you calculated the observed proportion was 

<0.8, you could have drawn this conclusion without 

doing the full calculations. 

3. There is not sufficient evidence at the 5% level to 

suggest that the proportion is greater than 0.2. 

4, There is not sufficient evidence at the 1% level to 

suggest that the average rate of accidents has decreased. 

5. i) z = (x̄ − μ)/(5.3/√n) = 1.867 ⇒ √n = 1.867 × 5.3/(x̄ − μ) = 8.246 ⇒ n = 68

ii) For a 5% one-tail test the critical value is 1.645,
so reject H₀ and conclude that there is evidence
at the 5% level to suggest that the mean height 
has increased. 
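
The sample-size step in part i) can be checked from the figures that are legible in the answer. In this sketch the difference between the sample mean and the hypothesised mean is not taken from the question; it is back-calculated from the quoted values and should be treated as an inferred quantity.

```python
z = 1.867        # test statistic quoted in the answer
sigma = 5.3      # population standard deviation quoted in the answer
sqrt_n = 8.246   # value of sqrt(n) quoted in the answer

# Sample size: square the quoted root and round to the nearest integer.
n = round(sqrt_n ** 2)

# Rearranging z = difference / (sigma / sqrt(n)) gives the implied difference.
implied_difference = z * sigma / sqrt_n

print(n, round(implied_difference, 2))
```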

6 a) i) 0.864 ii) 0.824 

b) Around holiday periods and birthdays, there are 
likely to be more letters than usual. 

c) Accept H₀ and conclude there is not sufficient
evidence to suggest the mean has increased. 

Exercise 9.3 page 161 

1. a) Accept H₀. b) Accept H₀. c) Accept H₀.

2. i) (599.97, 601.63) 

ii) Does not support the claim that the mean 
contents are 602 ml. 

3. i) (247.6,249.4) 


ii) Does not support the claim that the mean 
contents are 250 g. 


Summary exercise 9 page 161 


1. 


i) While the decrease might get them into legal 
difficulties, an increase in the mean contents will 
cost them money. 

ii) There is evidence at the 5% level to suggest that 
the mean has changed. 


Reject H₀ and conclude that there is evidence at
the 5% level to suggest that the pulse rates of
trained athletes are lower than those of healthy
young adults.


i) To take measurements of a complete population 
can be very expensive and take a long time - a 
sample is cheaper and offers quicker access to a 
reasonable level of information. 


ii) Reject the claim at the 5% level of significance, 


What the spokesperson says may be true - and
you can construct scenarios which give plausible 
explanations why this may not be reflected 

in the customers’ experiences - for instance 

if they reduced very slightly the 40% most 
expensive holidays that very few people actually 
bought last year then they could make the 

claim accurately even if it presents a misleading 
picture to the customer. 


There is evidence at the 5% level to suggest that the 
average rate of accidents has decreased. 


There is not sufficient evidence at the 1% level to 
suggest that the average height has increased. 


a) Accept H₀ and conclude that there is not
sufficient evidence to suggest that the mean 
length of pencils is greater than 14 cm. 


b) No: this is a large sample so the CLT allows the normal to
be used as an approximation. 


a) H₀: μ = 34.2 vs H₁: μ < 34.2, z = -1.90, so reject
H₀ and conclude that there is evidence at the
5% level of significance to suggest that the mean 
time for the procedure has reduced. 


b) Note: there is no single ‘right answer’ to this 
— the following is an illustration of a possible 
answer. 


The test was significant, suggesting that the
mean time is lower, but on average it has reduced 
by 0.7 minutes, which is 2%. The hospital should 
consider cost, both the cost of the extra training 
and the potential savings of a 2% reduction in 
operating time, and consider how much other 
change the staff are experiencing — whether this 
is a reasonable burden if there are worthwhile 
savings to be made. 


X~Po( 365% 
30 


H₀: μ = 482 vs H₁: μ < 482, 


z = −2.03, so reject H₀ and conclude that there is 
evidence at the 5% level of significance to suggest that 
the number of accidents at that bend has reduced. 


a) B(500, 0.43) ≈ N(215, 122.55), so 
H₀: μ = 215 vs H₁: μ > 215. 

z = 2.57 > 2.326, so reject H₀ and 
conclude that there is evidence at the 1% level 
of significance to suggest that there has been an 
increase in the number of people taking regular 
exercise in that area. 


b) A Type I error in this context would be to 
conclude that there was an increase in people 
taking regular exercise when there had been no 
increase. Probability of Type I error is 1%: since 
it is a test using a continuous distribution, it is 
the same as the significance level. 


c) If the aim of the advertising campaign is to have 
a long-term effect on the exercise behaviour of 
people it would be better to conduct the survey 
after a reasonable gap in time. A random sample 
of 500 people is large enough that the results 
should be fairly reliable if you can assume people 
answer truthfully. 
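
As an illustrative check of the working quoted in the answer above, the following sketch recomputes the normal approximation B(500, 0.43) ≈ N(215, 122.55) and the one-tail critical value at the 1% level, using only Python's standard library. The helper name `normal_approx_params` is hypothetical, and the test statistic 2.57 is simply copied from the answer rather than recomputed, since the observed count is not given here.

```python
from statistics import NormalDist


def normal_approx_params(n, p):
    """Return (mean, variance) of the normal approximation to B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance


mean, variance = normal_approx_params(500, 0.43)

# Upper-tail critical value for a one-tail test at the 1% level.
critical_z = NormalDist().inv_cdf(0.99)

# Test statistic as quoted in the answer (assumed, not recomputed here).
z_reported = 2.57
reject_h0 = z_reported > critical_z

print(round(mean, 2), round(variance, 2))  # 215.0 122.55
print(round(critical_z, 3))                # 2.326
print(reject_h0)                           # True
```

Because the normal distribution is continuous, the probability of a Type I error for a test built this way equals the chosen significance level.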




Exam-style paper A page 166 


1. (747.43, 748.57) 

2. a) The probability of getting one head is 0.5 so not 
all nieces have an equal chance. 

b) She could put their names on pieces of paper 
and draw them from a hat; she could toss a coin 
twice and take niece 1 if she sees a head followed 
by a tail, and if she gets a tail followed by a head 
she repeats the process until she can choose a 
niece. [There are many other ways of generating 
3 equally likely events, any of which will do.] 

a) 0.570 b) 224 seconds 

a) m = 2.68; s² = 1.49 (3 s.f.) 

b) Testing H₀: μ = 2.3 against H₁: μ ≠ 2.3 
at the 5% level has acceptance region 
\( 2.3 \pm 1.96 \times \sqrt{s^2/50} = (1.962,\ 2.638) \). Since 
m = 2.68 lies above this interval, there is 
sufficient evidence to reject H₀ and we conclude 
that there is evidence at the 5% level of 
significance that the mean has changed. 



a) 0.0328 b) 0.0912 c) 0.360 

a) \( \int f(x)\,dx = \left(a + \tfrac{1}{4}\right) - (0 + 0) = 1 \Rightarrow a = \tfrac{3}{4} \) 

b) Median = 0.562 c) 0.542 

a) In this situation a Type I error would be 
concluding that the generator was biased when 
in fact it was producing zeroes randomly with 
probability 0.1. P(Type I error) = 0.0702 

b) P(Type II error) = 0.526 

c) (0.195, 0.383) 
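
The acceptance-region answer earlier in this paper (testing a mean of 2.3 at the 5% level) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it takes the variance estimate 1.49 and the sample mean 2.68 from that answer, assumes the "50" in the working is the sample size, and uses a hypothetical helper name `acceptance_region`.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_region(mu0, variance, n, level=0.05):
    """Two-tail acceptance region for the sample mean under H0: mu = mu0."""
    z = NormalDist().inv_cdf(1 - level / 2)  # about 1.96 for a 5% test
    half_width = z * sqrt(variance / n)
    return mu0 - half_width, mu0 + half_width


# Values read from the answer; n = 50 is an assumption about the sample size.
low, high = acceptance_region(mu0=2.3, variance=1.49, n=50)
sample_mean = 2.68

# Reject H0 when the sample mean falls outside the acceptance region.
reject_h0 = not (low <= sample_mean <= high)

print(round(low, 3), round(high, 3))
print(reject_h0)
```

With these inputs the interval rounds to the one quoted in the answer, and the sample mean lies above its upper end, so the null hypothesis is rejected.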


Exam-style paper B page 168 

1. 101 

2. 0.291 

3. Reject H₀ and conclude that there is sufficient 
evidence at the 5% level to suggest that the mean 
volume is less than 500 ml. 

4. If C = amount paid to hire the car 
a) E(C) = $33.20; Standard dev C = $4.20 
b) E(C̄) = $33.20; Standard dev (C̄) = $1.48 

5. a) 0.088 b) (Oo99 

6. a) \( \int k\sqrt{t}\,dt = \left[\tfrac{2}{3}kt^{3/2}\right] = \tfrac{52k}{3} = 1 \Rightarrow k = \tfrac{3}{52} \) 
b) BUR seiss ie ey ; 
c) \( \int \tfrac{3}{52}\sqrt{t}\,dt = \left[\tfrac{1}{26}t^{3/2}\right] = \tfrac{1}{26} \times 13 = 0.5 \) 
d) 0.031 

7. a) 0.348 b) 0.472 


c) If 4 suffer the side effect in the test then the null 
hypothesis will not be rejected, so a Type II error 
is the only one possible. 




Glossary 


Command words 


calculate Work out from given facts, figures 
and information. 


describe Give the characteristics and main 
features. 

determine Establish with certainty. 

evaluate Judge or calculate the quality/ 
importance/amount/value of something. 

explain Set out purposes/reasons/mechanisms, 
or make the relationship between things clear, 
with supporting evidence. 

identify Name/select/recognise. 

justify Support a case with evidence/argument. 
show that Provide structured evidence that 
leads to a given result. 

sketch Make a simple freehand drawing 
showing the key features. 

state Express in clear terms. 


verify Confirm that a given statement/result 
is true. 


Mathematical words 

acceptance region The region containing 
all the observations for which H₀, the null 
hypothesis, fails to be rejected. 

alternative hypothesis, H₁ The proposition 
that your null hypothesis, H₀, is not true. 

bias An over- or underrepresentation of 
certain elements of a population in a sampling 
method. 

binomial distribution A distribution used to 
model how often one outcome occurs in a series 
of Bernoulli trials. 

census A technique that takes measurements 
of the entire population. 

cluster sampling A sampling method which 
reduces costs of sampling by choosing a number 
of groups (often geographical areas) first and 
then sampling within the group or cluster. 



confidence interval An interval giving a range 
within which you have a specified confidence 
that the population mean will lie. 

consistent (of an estimator) The property 
that, as the sample size increases, the probability 
that the estimator is close to the true value also 
increases. 

continuity correction A correction required 
when a continuous distribution (usually the 
normal) is used to approximate the binomial (or 
any other distribution that takes only discrete 
values). 

continuous random variable A random 
variable where the data can take any value within 
an interval. 

critical region The region into which 
observations fall that cause you to reject the null 
hypothesis, H₀. 

discrete random variable A random variable 
where the data can take a countable number of 
values. 


efficiency A measure of the quality of an 
estimator. 

elements Members of a population to be 
measured. 


estimate When data are summarised in 
intervals, details of exact values are not known 
and we can only work out approximations 
(estimates) to summary statistics like mean, 
median etc. 

expectation The mean of a probability 
distribution. 

frequency density The average rate at which 
observations occur in an interval (= frequency / 
interval width). 

hypothesis test/testing A technique which 
compares observed data with the sampling 
distribution when the null hypothesis is true 
and looks for an alternative explanation if the 
observation would be classified as a ‘rare event’ 
under the null hypothesis. 


independent random variables Random 
variables for which knowledge of the value of 
one tells you nothing about the value of the 
other. 

interval estimation The technique which 
produces confidence intervals. 

mean The sum of all the values divided by the 
number of values. 

median The middle value when the values are 
arranged in order. 

mode The most commonly occurring value. 

natural variation The differences in a 
population brought about by chance. 

normal distribution A continuous 
distribution that is symmetric and infinite 
in both directions; approximately 95% of values lie 
within 2 standard deviations of the mean and about 
99.7% lie within 3 standard deviations of the mean. 


null hypothesis, H₀ The probability basis 
under which you consider a test. It provides a 
default statement that you will accept unless the 
observed data would rarely be seen if the null 
hypothesis is true. 

one-tail test A test where only very low or very 
high values are considered ‘extreme’. 

point estimate A single value given as an 
estimate of a parameter of a population. 

Poisson distribution The probability 
distribution of the number of events occurring 
in a fixed interval of time or space when the 
events occur randomly, but the mean number of 
events is proportional to the size of the interval. 

population The larger group from which a 
sample is drawn. 

power of the test The function which gives 
the probability of correctly rejecting the null 
hypothesis, H₀ (that is, 1 minus the probability of a 
Type II error), for a range of values for the 
alternative hypothesis, i.e. how good a test is at 
discriminating the null hypothesis, H₀, from alternatives. 

probability density function The function 
f(x) which generates the probability of a 
continuous random variable lying in any interval 
by the definite integral over the interval, i.e. 
\( P(a < X < b) = \int_a^b f(x)\,dx \). It can be thought of 
as the rate at which the probability occurs at any 
value. 

probability distribution Describes all the 
possible outcomes in a situation along with the 
probability of each outcome, either in a list or by 
a formula. 

quota sampling A sampling approach where 
specific proportions of different groups have to 
be included. 

random sample A sample in which each 
member of the population has an equal chance 
of being selected, and all possible combinations 
are equally likely. 

random variable A quantity that can take any 
value determined by the outcome of a random 
event. 


recurrence relation An equation that defines a 
sequence based on a rule that gives one term as a 
function of the previous term(s). 

rejection region Another name for critical 
region. 

sample A group selected from the larger 
population which is used to make estimates of 
the population. 

sample mean The mean value of the samples 
in a data set, used as an unbiased estimator for 
the population mean. 

sample space A list (in curly brackets { }) of 
the compound event outcomes. Each outcome 
is in its own pair of brackets ( ). For example the 
sample space for tossing a coin twice is {(H, H); 
(H, T); (T, H); (T, T)}. 

sampling distribution of the statistic The 
probability distribution of the given statistic, e.g. 
the mean. 

sampling frame A list containing all the 
units or elements that are the members of the 
population. 

self-selection Where members of a group 
choose to participate, e.g. a radio phone-in. 




significance level of a test The proportion to 
be classed as a ‘rare’ event in a hypothesis test 
(5% is the standard level). When a ‘rare’ event is 
observed, you decide to reject the null hypothesis 
in favour of the alternative hypothesis. 

simple random sampling The process of 
selecting samples randomly where each member 
of the population has an equal chance of being 
selected, and all possible combinations are 
equally likely. 

standard deviation The square root of the 
variance. 

standard error The standard deviation of the 
sampling distribution. 

statistic A property of the data set, e.g. mean, 
standard deviation, range. 

stratified sampling A sampling method where 
a population has distinct groups and the same 
proportion of each group is included in the 
sample by sampling randomly within each group. 



systematic sampling A non-random sampling 
technique that takes a number at random from 1 
to n and then takes every nth value thereafter. 

test statistic The observed statistic in a 
hypothesis test. 

two-tail test A test where you consider 
‘extreme’ values at both ends of the range as a 
rejection of the null hypothesis. 

Type I error The null hypothesis is actually 
true but you reject it. 

Type II error You accept the null hypothesis 
when it is not true. 

unbiased estimator For any parameter ‘on 
average’ an unbiased estimator will give the true 
value of the parameter. 

variance The average of the squared distances 
from the mean. 

z-score Describes how many standard 
deviations a data point is from the mean. 
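
Several of the glossary entries above describe quantities that are easy to compute directly. The following is a minimal sketch, using only Python's standard library, of three of them: the proportion of a normal distribution within a given number of standard deviations of the mean, a z-score, and the standard error. The function names and the sample figures (a mean of 100, a standard deviation of 15, a sample of 25) are hypothetical and chosen only for illustration; the standard error formula shown is the one for a sample mean.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def standard_error(sd, n):
    """Standard deviation of the sampling distribution of the mean of n observations."""
    return sd / sqrt(n)


print(round(proportion_within(2), 4))  # 0.9545
print(round(proportion_within(3), 4))  # 0.9973
print(z_score(130, 100, 15))           # 2.0
print(standard_error(15, 25))          # 3.0
```

The first two results show why "within 2 standard deviations" is usually quoted as roughly 95% and "within 3 standard deviations" as more than 99%.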

